Linux Needs GC Lingua Franca(s) to Win
Keith Curtis joins us once again for a discussion on Linux. A year ago, he spoke of How Linux Could Achieve Faster World Domination. Now he's back with a more focused view on just what Linux needs to pull ahead.
If we were already talking to our computers, etc. as we should be, I wouldn’t feel a need to write this to you. Given current rates of adoption, Linux still seems a generation away from being the priceless piece of free software useful to every child and PhD. This army your kernel enables has millions of people, but they often lose to smaller proprietary armies, because they are working inefficiently. My mail one year ago listed the biggest workitems, but I realize now I should have focused on one. In a sentence, I have discovered that we need GC lingua franca(s).
Every Linux success builds momentum, but the desktop serves as a powerful daily reminder of the scientific tradition. Many software PhDs publish papers but not source, like Microsoft. I attended a human genomics conference and found that the biotech world is filled with proprietary software. IBM’s Jeopardy-playing Watson is proprietary, like Deep Blue was. This topic is not discussed in any of the news articles, as if the license does not matter. I find widespread fear of having ideas stolen in the software industry, and proprietary licenses encourage this. We need to get these paranoid programmers, hunched in the shadows, scribbled secrets clutched in their fists, working together, for any of them to succeed. Windows is not the biggest problem; the bigger problem is the proprietary licensing model that has infected computing and science. Desktop world domination is not necessary, but it is sufficient to get robotic chauffeurs and butlers.
There is, unsurprisingly, a consensus among kernel programmers that user mode is “a mess” today, which suggests there is a flaw in the Linux desktop programming paradigm. Consider the vast cosmic expanse of XML libraries in a Linux distribution. As in computer vision, there are not yet clear places for knowledge to accumulate. It is a shame that the kernel is so far ahead of most of user mode.
The most popular free computer vision codebase is OpenCV, but it is time-consuming to integrate because it defines an entire world in C++ down to the matrix class. Because C/C++ didn’t define a matrix, nor provide code, countless groups have created their own. It is easier to build your own computer vision library using standard classes that do math, I/O, and graphics, than to integrate OpenCV. Getting productive in that codebase is months of work and people want to see results before then. Building it is a chore, and they have lost users because of that. Progress in the OpenCV core is very slow because the barriers to entry are high. OpenCV has some machine learning code, but they would be better delegating that out to others. They are now doing CUDA optimizations they could get from elsewhere. They also have 3 Python wrappers and several other wrappers as well; many groups spend more time working on wrappers than the underlying code. Using the wrappers is fine if you only want to call the software, but if you want to improve the underlying code, then the programming environment instantly becomes radically different and more complicated.
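To make the wrapper point concrete, here is a minimal sketch in Python of what a thin wrapper over C code looks like. It is not taken from OpenCV; it wraps the C library's `strlen` through the standard `ctypes` module, and the helper names `load_libc` and `c_strlen` are hypothetical. It assumes a POSIX system such as Linux.

```python
import ctypes
import ctypes.util


def load_libc():
    """Load the C library (assumes a POSIX system such as Linux)."""
    path = ctypes.util.find_library("c")
    # CDLL(None) falls back to the symbols already loaded into this process.
    return ctypes.CDLL(path) if path else ctypes.CDLL(None)


libc = load_libc()

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Hypothetical wrapper: hand a Python string to C and return the result."""
    return libc.strlen(text.encode("utf-8"))


length = c_strlen("lingua franca")
```

Calling `c_strlen` takes one line, but changing how `strlen` itself behaves means leaving Python for C source, a compiler, and a build system. That is the jump in complexity described above.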
There is a team working on Strong AI called OpenCog, a C++ codebase created in 2001. They are evolving slowly as they do not have a constant stream of demos. They don’t consider that their codebase is a small number of world-changing ideas buried in engineering baggage like STL. Their GC language for small pieces is Scheme, an unpopular GC language in the FOSS community. Some in their group recommend Erlang. The OpenCog team looks at their core of C++, and over to OpenCV’s core of C++, and concludes the situation is fine. One of the biggest features of ROS (Robot OS), according to its documentation, is a re-implementation of RPC in C++, which is not what robotics was missing. I’ve emailed various groups and all know of GC, but they are afraid of any decrease in performance, and they do not think they will ever save time. The transition from brooms to vacuum cleaners was disruptive, but we managed.
C/C++ makes it harder to share code amongst disparate scientists than a GC language. It doesn’t matter if there are lots of XML parsers or RSS readers, but it does matter if we don’t have an official computer vision codebase. This is not against any codebase or language, only for free software lingua franca(s) in certain places to enable faster knowledge accumulation. Even language researchers can improve and create variants of a common language, and tools can output it from other domains like math. Agreeing on a standard still gives us an uncountably infinite number of things to disagree over.
Because the kernel is written in C, you’ve strongly influenced the rest of the community. C is fully acceptable for a mature kernel like Linux, but many concepts aren’t so clear in user mode. What is the UI of OpenOffice when speech input is the primary means of control? Many scientists don’t understand the difference between the stack and the heap. Software isn’t buildable if those with the necessary expertise can’t use the tools they are given.
C is a flawed language for user mode because it is missing GC, invented a decade earlier, and C++ added as much as it took away as each feature came with an added cost of complexity. C++ compilers converting to C was a good idea, but being a superset was not. C/C++ never died in user mode because there are now so many GC replacements that many are paralyzed into inaction, as there seems no clear place to go. Microsoft doesn’t have this confusion as their language, as of 2001, is C#. Microsoft is steadily moving to C#, but it is 10x easier to port a codebase like MySQL than SQL Server, which has an operating system inside. C# is taking over at the edges first, where innovation happens anyway. There is a competitive aspect to this.
Lots of free software technologies have multiple C/C++ implementations, because it is often easier to re-write than share, and an implementation in each GC language. We all might not agree on the solution, so let’s start by agreeing on the problem. A good example for GC is how a Mac port can go from weeks to hours. GC also prevents code from using memory after freeing it, freeing it twice, etc., and therefore that user code is less likely to corrupt memory. If everyone in user mode were still writing in assembly language, you would obviously be concerned. If Git had been built in 98% Python and 2% C, it would have become easier to use faster, found ways to speed up Python, and set a good example. It doesn’t matter now, but it was an opportunity in 2005.
You can “leak” memory in GC, but that just means that you are still holding a reference. GC requires the system to have a fuller understanding of the code, which enables features like reflection. It is helpful to consider that GC is a step-up for programming like C was to assembly language. In Lisp the binary was the source code — Lisp is free by default. The Baby Boomer generation didn’t bring the tradition of science to computers, and the biggest legacy of this generation is if we remember it. Boomers gave us proprietary software, C, C++, Java, and the bankrupt welfare state. Lisp and GC were created / discovered by John McCarthy, a mathematician of the WW II greatest generation. He wrote that computers of 1974 were fast enough to do Strong AI. There were plenty of people working on it back then, but not in a group big enough to achieve critical mass. If they had, we’d know their names. If our scientists had been working together in free software and Lisp in 1959, the technology we would have developed by today would seem magical to us. The good news is that we have more scientists than we need.
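The point that a GC "leak" is just a reference you are still holding can be seen in a few lines. This is a minimal sketch, assuming CPython; the `Frame` class, the `cache` list, and the `process` function are invented for illustration. A weak reference is used only to observe whether the object is still alive.

```python
import gc
import weakref


class Frame:
    """Stand-in for a large object such as an image buffer (hypothetical)."""

    def __init__(self, name):
        self.name = name


cache = []  # a long-lived container that quietly keeps objects alive


def process(name):
    frame = Frame(name)
    cache.append(frame)        # the "leak": a reference we forgot about
    return weakref.ref(frame)  # a weak reference does not keep the object alive


probe = process("frame-0")
gc.collect()
still_alive = probe() is not None  # the cache still references the object

cache.clear()                      # drop the last strong reference
gc.collect()
reclaimed = probe() is None        # the runtime was free to reclaim it
```

The same rule covers the memory-safety claim: as long as any reference exists the object stays valid, so there is no way to write a use-after-free or a double free in this code.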
There are a number of good languages, and it doesn’t matter too much which one is chosen, but it seems the Python family (Cython / PyPy) requires the least amount of work to get what we need as it has the most extensive libraries: http://scipy.org/Topical_Software. I don’t argue the Python language and implementation are perfect, only good enough, like how the shapes of the letters of the English language are good enough. Choosing and agreeing on a lingua franca will increase the results for the same amount of effort. No one has to understand the big picture, they just have to do their work in a place where knowledge can easily accumulate. A GC lingua franca isn’t a silver bullet, but it is the bottom piece of a solid science foundation and a powerful form of social engineering.
The most important thing is to get lingua franca(s) in key fields like computer vision and Strong AI. However, we should also consider a lingua franca for the Linux desktop. This will help, but not solve, the situation of the mass of Linux apps feeling dis-integrated. The Linux desktop is a lot harder because code here is 100x bigger than computer vision, and there is a lot of C/C++ in FOSS user mode today. In fact it seems hopeless to me, and I’m an optimist. It doesn’t matter; every team can move at a different pace. Many groups might not be able to finish a port for 5 years, but agreeing on a goal is more than half of the battle. The little groups can adopt it most quickly.
There are a lot of lurkers around codebases who want to contribute but don’t want to spend months getting up to speed on countless tedious things like learning a new error handling scheme. They would be happy to jump into a port as a way to get into a codebase. Unfortunately, many groups don’t encourage these efforts as they feel so busy. Many think today’s hardware is too slow, and that running any slower would doom the effort; they do not appreciate the steady doublings and forget that algorithm performance matters most. A GC system may add a one-time cost of 5-20%, but it has the potential to be faster, and it gives people more time to work on performance. There are also real-time, incremental, and NUMA-aware collectors. The ultimate in performance is taking advantage of parallelism in specialized hardware like GPUs, and a GC language can handle that because it supports arbitrary bitfields.
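As a rough illustration of why algorithm choice can outweigh a fixed runtime overhead, the sketch below counts loop steps for two ways of checking a list for duplicates. Step counts are only a proxy for running time, the function names are made up for this example, and the 20% figure is simply the upper end of the range quoted above.

```python
def has_duplicate_quadratic(items):
    """Compare every pair of items; returns (found, steps taken)."""
    steps = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            steps += 1
            if items[i] == items[j]:
                return True, steps
    return False, steps


def has_duplicate_linear(items):
    """Remember items already seen in a set; returns (found, steps taken)."""
    steps = 0
    seen = set()
    for item in items:
        steps += 1
        if item in seen:
            return True, steps
        seen.add(item)
    return False, steps


data = list(range(1000))  # worst case: no duplicates at all

_, slow_steps = has_duplicate_quadratic(data)
_, fast_steps = has_duplicate_linear(data)

GC_OVERHEAD = 1.20  # assumed: charge the linear version a 20% penalty
ratio = slow_steps / (fast_steps * GC_OVERHEAD)
```

Even with the penalty applied, the linear version does several hundred times less work on a thousand items, and the gap widens as the input grows.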
Science moves at demographic speed when knowledge is not being reused among the existing scientists. A lingua franca makes more sense as more adopt it. That is why I send this message to the main address of the free software mothership. The kernel provides code and leadership, you have influence and the responsibility to lead the rest, who are like wandering ants. If I were Linus, I would threaten to quit Linux and get people going on AI ;-) There are many things you could do. I mostly want to bring this to your attention. Thank you for reading this.
-Keith
Curtis spent 11 years as a Software Design Engineer at Microsoft before examining Linux and the open source side of things, which resulted in a change of perspective and a published book. See more about his book here, including a link to a free PDF version.
This content originally appeared on Keith Curtis' blog.
The only thing linux needs is acceptance as an option for an OS on the prebuilt granny computers: DELL/HP/Etc Etc.. Once you have it as an option as standard equipment with those companies... it's on.
Oh, and I suppose another problem is there are too many damn flavors of linux out there. If one can rise above the rest like Ubuntu is starting to, the chances of linux going mainstream go up
I wonder what he thinks of the D programming language.
Totally agree. The Linux environment has to converge in some way, a common set of libraries would be a good choice.
It doesn't have to be a specific language if the library can be used in it (Code in Python, library in C/C++).
Duplication of core features...imagine you had a set of libraries for everything you want (like VB programers tell us), coding would be much faster and easier.
personally i'm creating my own programming language just for fun. its a lot of hard work. gets crazy at times. which makes me appreciate all other languages out there.
Linus will never be mainstream and the main reason, amongst many, is too many kernels.
The only thing linux needs is acceptance as an option for an OS on the prebuilt granny computers: DELL/HP/Etc Etc.. Once you have it as an option as standard equipment with those companies... its on.Oh, and I suppose another problem is there is too many damn flavors of linux out there. If one can rise above the rest like Ubuntu is starting too, the chances of linux going mainstream go up
If you want it on "granny" computers you will have to do away with terminal because the target audience is too comfortable with GUIs. To be frank, the terminal is what makes linux great but also what scares the average joe away.
you mean a unix-based OS without a terminal? They have that and it's apple. As far as linux goes, you can always hide the terminal. i mean it is linux... you can do whatever the hell you want.
True, but there are just too many things that are easier to do in terminal or don't have a GUI component. Just installing packages is usually done with something like apt-get or yum. The first thing the community (or what the average joe will call tech support) tells you to do to fix anything is open a terminal window. Linux just isn't as computer illiterate friendly as it could be.
ivan_chess: Apparently you haven't used Linux in the past few years. You can download Ubuntu, install it through the Ubiquity GUI, then do all of your web browsing, photo viewing, etc... without touching the terminal. In the event something doesn't work correctly, the tech person fixing it may resort to using the command line, however, the same thing applies to Windows, so this isn't really Linux-specific. I do Linux development, and if I use the command-line, it's purely by choice, as there are GUIs for everything.
The terminal argument is every bit as outdated as the "OMGz, you must get Nvidia if you run Linux, because their drivers R L337", even though in 2010, AMD's Linux drivers piss all over Nvidia, and AMD supports the open driver, whereas Nvidia is opposed to the open source Nouveau driver.
What does GC stand for?
you mean a unix-based OS without a terminal? They have that and it's apple. As far as linux goes, you can always hide the terminal. i mean it is linux... you can do whatever the hell you want.
OS X has a terminal - I use it all the time...
What does GC stand for?
Garbage Collection. They really should have stated it in the article. I am familiar with the concept but not with the internal politics of Linux development, so I read most of it not knowing what the hell they were talking about until they mentioned C#.
True, but there are just too many things that are easier to do in terminal or don't have a GUI component. Just installing packages is usually done with something like apt-get or yum. The first thing the community (or what the average joe will call tech support) tells you to do to fix anything is open a terminal window. Linux just isn't as computer illiterate friendly as it could be.
Funny, but that's often what tech support wants you to do on Windows. Having Internet problems? Well you'll often get told to run a trace route to see if the problem is at a specific hop. Some things just don't receive a GUI because there is no reason to add one. A GUI should be used when needed, not just for the sake of having one. Too much software has a GUI that is so poorly laid out that it would be quicker to learn a few commands and bash (pun intended) them into the command line.
you mean a unix-based OS without a terminal? They have that and it's apple. As far as linux goes, you can always hide the terminal. i mean it is linux... you can do whatever the hell you want.
Not true. You can open a terminal in Mac OS X. That's how I do my work there after all... (Though I usually just use an SSH connection to be frank, I definitely prefer Windows.) I'm a software developer if that wasn't clear
AS far as commenting on the article, I certainly try to steer clear of integrating open source code within my code. Working on proprietary codebases and all...
What struck me is the author never defined what GC means even though it's the central theme to the article. For those that haven't yet figured it (and it took me a long time to figure it out myself even though I'm a software developer), it means Garbage Collection - automatically reclaiming memory that the program no longer uses.
Finally, I was under the impression that such GC lingua franca already exists -- it's called Java...
Eiffel language was the best, but "not invented here" syndrome all but killed it.
All the goodies of C++ none of the bad sides. Perfect GC
Environment to operate with incremental compiler was amazing back then when computers were slow.
The transparent design/programming is still...
At least one userbase built on linux kernel has GC and clear default set of libraries. It's just that advantages of this did not occur to "community", but to another for-profit-company: Google. And they don't even call it linux.
---
While at that. GC isn't good for everything, because at least implementations that I know of have to "stop the world" for at least some time during GC, which results in annoying pause while, for example, playing your game. As for leaks, tools like valgrind make it trivial to detect most of them.
Linus will never be mainstream and the main reason, amongst many, is too many kernels.
Linus won't become mainstream. He prefers to spend time with his family and isn't by choice a public figure.
Linux actually has the opposite advantage: it's released as one main version, not many as you seem to suggest. Linux releases follow a set schedule. That Linux is easily modified makes it just as suitable for standard as for obscure hardware/implementations. I think you've misunderstood something fundamental about the nature of the Linux kernel. From the kernel source you can compile it to support practically all known platforms; hence you don't need to maintain different kernel sources.
Microsoft on the other hand actually has a situation of incompatible kernels developed separately, even though they share some elements. That has also been pointed out as a hindrance to how well Microsoft will adapt to the fast-moving markets of smartphones and tablets.
The reason the Linux kernel itself is coded in C is simple: C doesn't forgive errors. Therefore Linus sees no benefit in coding in a higher level language that eventually would add more garbage, poor quality, code.
after 5 paragraphs I gave up. what the hell is the guy talking about ? a good way to start an article is to describe the base you are building upon.
this seems to be written by a geek that never sees the light of day and has problems forming coherent sentences.
I guess those 11 years at MS did leave a scar on his soul.
I'm a software engineer. I received my degree from a top state university. I've used "Linux" (Ubuntu and Redhat mostly), Macs, and Windows. I develop applications on Windows because of what I am about to say. I just want to make it clear, I'm not anti-linux or trying to bash it. There are tons of reasons to love it. However, as an average user there is a lot of reasons why you will be turned off before you even really start to learn it.
Linux isn't a generation away, it's a revolution away.
Most of the OSs that are put on top of the kernel are ugly, hard to navigate, and full of other usability nightmares. The learning curve going from a Windows or Mac environment is seriously steep. Sure, you can Google "how to do _________ in (insert distro name here)" and there is probably an idiot proof video on youtube, but you will be surprised how few home users can even do that.
That isn't a function of what tools are used to develop applications. Linux software has too many engineers working on it and too few designers.
It's really that simple and until people on the software side of this understand that, it will never get any closer to Windows or Macintosh OS (as a desktop OS - because cellphones and other devices are flocking to Linux based OSs)
Seriously, I don't think the need for garbage collection is the biggest problem Linux has these days.
True, but there are just too many things that are easier to do in terminal or don't have a GUI component. Just installing packages is usually done with something like apt-get or yum. The first thing the community (or what the average joe will call tech support) tells you to do to fix anything is open a terminal window. Linux just isn't as computer illiterate friendly as it could be.
That's one of the main problems that I have noticed also. I have gotten a few people to try linux, especially when they kept messing up their windows install.
I set up a windows and ubuntu dual boot and none of them could use ubuntu for a length of time. They easily got frustrated when they came to a point when something required command line.
(All those saying that you don't need to use the command line if you don't want to do not understand that not every bit of code will come in a nice, easy-to-install .deb;
you are very likely to get a tar.gz that takes a ton of work in the command line to install.)
Another annoyance is that many command line processes such as installing a program, will require you to do the same things, this is how a tutorial can tell you step by step how to install a tar.gz or other format, you can even copy and paste the commands in and just change the names
If they want more people to use linux then it needs to get rid of the useless work.
For example, if a user finds an awesome new app and it is not a .deb,
when the user double clicks on the tar file or the rpm file or what ever other annoying format is used, it will not provide them with the option to install or run the program.
What the OS needs to do is pop up something like, I see that you are trying to use this tar.gz file, would you like me to install it for you?
linux needs more automation.
Windows and mac are so popular because they do a good job of keeping the users from ever having to touch the command line.
While I have no problem using command line, I would prefer to not have to use it as it requires more button presses and time compared to a GUI where you just click.
For example, there some programs that do not have a GUI and you have to use command line, and you will often see tutorials and forum posts about people trying to get the software to work. it will require multiple commands to be entered.
Then once in a while, someone will make a GUI front end which basically does the command line for you, and then the complaints stop because the user can achieve the same result with far less work.
For example, if I am testing the security of my WPA2 network to see how short I can go with the passwords that I get from GRC before the password becomes too weak, I can get the app needed to do this and type in over 20 commands in order to start the process, or I can install the GUI front end and simply select my network from a list, click on the security type, then click start, and it does the rest. It turns a 20+ step process into a 3 step process.
Windows is popular because probably 99.999% of all programs have a GUI and either run with out even installing from a easy to use exe file or can be installed by simply clicking next a few times.
When I first show Ubuntu or Mint to somebody they're always impressed at how quickly it installs and how everything works. After a while they start running into annoyances and problems and it's not long before they go back to Windows (even XP).
Also if you're a gamer you can pretty much forget Linux. Same for web developers - they need to test sites in IE and Safari and some people need Photoshop and Flash.
I don't see why Linux is trying to get mass acceptance anyway, not like people are going to start paying for it. Linux already runs on over 90% Top 500 supercomputers, it runs majority of web servers and there's also Android. Why do we need or want to get grannies to use it is beyond me.
From a user perspective I think they need one window manager where everything is always in the same place, one central repository for dependencies, and maintain them for at least 10 years. They also need to update less often, and release updates sort of like MS does in a service pack. Simply asking someone what version of Windows they are running gives a lot of info quickly. Why would a business install an OS that had less than about a 7 year support term? They want someone who thinks a computer is magic to use it for 10 years without it ever crashing. It needs a central authority for updates, yet maintain the ability to customize for those that want to. I think using osx on a PC sort of gives you that, along with the cheap price. Android might be able to pull it off if they keep control of it.
So, Mr. Curtis thinks we need to reinvent the JVM? E.g. OpenJDK is available for all major distributions. Or am I just stupid and I don't get the depth of this article?
What Linux needs is $1Billion in marketing. Thats all.
So, Mr. Curtis thinks we need to reinvent JVM? E.g. OpenJDK is available for al major distributions. Or am I just stupid and I don't get the depth this article?
I think what he tries to say, is that linux user program developers should start using OpenJDK/Mono/whatever, that gives productivity boost over C/C++ for majority of user applications, and not stick to C(++) most of the time. C++ would not be such a problem, if it had more de-facto or otherwise standardised libraries.
@dbranko
I'm fine with "whatever", as long as it has nothing to do with Microsoft. Mono will always be just good enough for people to get a taste of things. If you want the real thing (latest version, all features 100% ...), you are quietly invited to switch back to good old Windows. I don't think this helps Linux one bit.
As others have said the problem with "Linux" is no common program management methods. Every distro likes to do its own thing, this inflicts a level of clumsiness that is insurmountable to the average PC user. The fact that you have two completely different package installation camps is utter madness, there should be absolutely no reason for .rpm and .deb to exist separate of each other. Then you have the "just compile it" method, asking an average user to compile a piece of software is an absolute nightmare.
Next is all the god damn subversioning going on. It's like every single required library needs to be updated to the latest in order for a program to work, but then another program needs an older version of that same library. Now the user must make a choice, they can only use one of those two programs because the user isn't savvy enough to tinker with installation dependencies and library installation paths. Installation of "drivers" is ... utterly painful. Windows just requires you to download and run an .exe setup file, Linux requires you to recode sh!t.
There ~needs~ to be a central authority / release control on common library's and dependency trees. At least in MS Windows the application can often choose to install the libraries it needs into its local folder and use them from there without needing to put them in the system folder. This results in every application getting its own set and solves the compatibility issue.
And the lack of GUIs for everything is unacceptable. It may feel really cool and geeky for techs to do everything on a command line, but average users ~NEVER~ want to see a command line. The entire reason MS Windows became popular is because it did everything it possibly could to hide the command line from the user. And what little interaction with the command line the user had to do was kept to an absolute minimum with simple easy to follow commands. Unix in general practically requires people to learn RegExes and programming to utilize the command line.
I'm a Solaris 10 administrator, I practically live in bash. I don't mind it because its what I do, but I would never expect my parents or my gf to have to do what I do every day.
As others have said the problem with "Linux" is no common program management methods. Every distro likes to do its own thing, this inflicts a level of clumsiness that is insurmountable to the average PC user. The fact that you have two completely different package installation camps is utterly madness, there should be absolutely no reason for .rpm and .deb to exist separate of each other. Then you have the "just compile it" method, asking an average user to compile a piece of software is an absolute nightmare ...
When was the last time you checked a Linux distro? You have GUI for package management everywhere, no need to care about rpm etc. I'm using RHEL 6. I can't think of any missing GUIs in general. What do you think should be covered additionally, any examples?
As for compiling; why would an average user need to compile anything? All regular stuff is available in binary form.
Stop thinking "distro", last I checked MS Windows doesn't have a "distro". If I want to download / install a MS Windows program I go out and download the .exe or .msi and *bam* installed. If I want to download something for CentOS I get it in either .rpm (if I'm lucky) or source and I'm forced to compile / install it.
~Every~ distro's "software management" required the support of that distro, and if that distro doesn't have their own version of the software I want, then I'm stuck doing the compile / install dance. There is no universal installation / package management method, if I write a program then I'm forced to chose a distro in order to distribute it in binary form.