The GNU Compiler Collection (GCC) was, from its inception, written in C and compiled by a C compiler. Beginning in 2008, an effort was undertaken to change GCC so that it could be compiled by a C++ compiler and take advantage of a subset of C++ constructs. This effort was jump-started by a presentation by Ian Lance Taylor at the June 2008 GCC summit. As with any major change, this one had its naysayers and its problems, as well as its proponents and successes.
Reasons
Taylor's slides list the reasons to commit to writing GCC in C++:
- C++ is well-known and popular.
- It's nearly a superset of C90, which GCC was then written in.
- The C subset of C++ is as efficient as C.
- C++ "supports cleaner code in several significant cases." It never requires "uglier" code.
- C++ makes it harder to break interface boundaries, which leads to cleaner interfaces.
The popularity of C++ and its superset relationship to C speak for themselves. In stating that the C subset of C++ is as efficient as C, Taylor meant that if developers are concerned about efficiency, limiting themselves to C constructs will generate code that is just as efficient. Having cleaner interfaces is one of the main advantages of C++, or any object-oriented language. Saying that C++ never requires "uglier" code is a value judgment. However, saying that it supports "cleaner code in several significant cases" has a deep history, best demonstrated by gengtype.
What had happened was that developers were emulating features such as garbage collection, a vector class, and a tree class in C. This was the "ugly" code to which Taylor referred.
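To make the contrast concrete, here is a minimal sketch of a growable array written twice: once emulated in C style with a macro that stamps out a struct and functions per element type, and once as a C++ class template. This is not GCC's actual vector code. Every name (`DEFINE_C_VEC`, `int_vec`, `simple_vec`) is hypothetical, and the sketch assumes C++11 and trivially copyable element types so that `realloc` is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// C-style emulation: one macro expansion per element type.
// All fields are public, so any caller can corrupt len or cap.
#define DEFINE_C_VEC(T, NAME)                                   \
  struct NAME { T *data; std::size_t len; std::size_t cap; };   \
  static void NAME##_push(NAME *v, T x) {                       \
    if (v->len == v->cap) {                                     \
      std::size_t new_cap = v->cap ? v->cap * 2 : 4;            \
      void *p = std::realloc(v->data, new_cap * sizeof(T));     \
      if (!p) std::abort();                                     \
      v->data = static_cast<T *>(p);                            \
      v->cap = new_cap;                                         \
    }                                                           \
    v->data[v->len++] = x;                                      \
  }                                                             \
  static void NAME##_free(NAME *v) {                            \
    std::free(v->data);                                         \
    v->data = 0;                                                \
    v->len = v->cap = 0;                                        \
  }

DEFINE_C_VEC(int, int_vec)

// C++ version: one template, private storage, automatic cleanup.
template <typename T>
class simple_vec {
  static_assert(std::is_trivially_copyable<T>::value,
                "this sketch grows its buffer with realloc");
public:
  simple_vec() : data_(nullptr), len_(0), cap_(0) {}
  ~simple_vec() { std::free(data_); }
  simple_vec(const simple_vec &) = delete;
  simple_vec &operator=(const simple_vec &) = delete;

  void push(T x) {  // by value, so x stays valid if the buffer moves
    if (len_ == cap_) grow();
    data_[len_++] = x;
  }
  std::size_t size() const { return len_; }
  const T &operator[](std::size_t i) const { return data_[i]; }

private:
  void grow() {
    std::size_t new_cap = cap_ ? cap_ * 2 : 4;
    void *p = std::realloc(data_, new_cap * sizeof(T));
    if (!p) std::abort();
    data_ = static_cast<T *>(p);
    cap_ = new_cap;
  }
  T *data_;
  std::size_t len_, cap_;
};

// Sum 1..10 using the macro-generated C-style vector.
int c_style_sum() {
  int_vec v = {nullptr, 0, 0};
  for (int i = 1; i <= 10; ++i) int_vec_push(&v, i);
  int sum = 0;
  for (std::size_t i = 0; i < v.len; ++i) sum += v.data[i];
  int_vec_free(&v);  // forgetting this call leaks the buffer
  return sum;
}

// Sum 1..10 using the class template; the destructor frees the buffer.
int cxx_style_sum() {
  simple_vec<int> v;
  for (int i = 1; i <= 10; ++i) v.push(i);
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
  return sum;
}
```

Both functions compute the same result. The difference is in what the compiler checks: the template version keeps its storage private and releases it automatically, which is the kind of interface boundary the slides refer to.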
In his slides, Taylor also tried to address many of the initial objections: that C++ was slow, that it was complicated, that there would be a bootstrap problem, and that the Free Software Foundation (FSF) wouldn't like it. He addressed the speed issue by pointing out that the C subset of C++ is as efficient as C. As for the FSF, Taylor wrote, "The FSF is not writing the code."
The complexity of a language is in the eye of the beholder. Many GCC developers were primarily, or exclusively, C programmers, so of necessity there would be a time period in which they would be less productive, and/or might use C++ in ways that negated all its purported benefits. To combat that problem, Taylor hoped to develop coding standards that limited development to a subset of C++.
The bootstrap problem could be resolved by ensuring that GCC version N-1 could always build GCC version N, and that they could link statically against libstdc++. GCC version N-1 must be linked against libstdc++ N-1 while it is building GCC N and libstdc++ N; GCC N, in turn, will need libstdc++ N. Static linking ensures that each version of the compiler runs with the appropriate version of the library.
For many years prior to 2008, there had been general agreement to restrict GCC code to a common subset of C and C++, according to Taylor (via email). However, there was a great deal of resistance to replacing the C compiler with a C++ compiler. At the 2008 GCC summit, Taylor took a poll on how large that resistance was, and approximately 40% were opposed. The C++ boosters paid close attention to identifying and addressing the specific objections raised by C++ opponents (speed, memory usage, inexperience of developers, and so on), so that each year thereafter the size of the opposition shrank significantly. Most of these discussions took place at the GCC summits and via unlogged IRC chats. Therefore, the only available record is in the GCC mailing list archives.
First steps
The first step, a proper baby step, was merely to try to compile the existing C code base with a C++ compiler. While Taylor was still at the conference, he created a gcc-in-cxx branch for experimenting with building GCC with a C++ compiler. Developers were quick to announce their intention to work on the project. The initial build attempts encountered many errors and warnings, which were then cleaned up.
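The article does not list the specific errors that came up, so the following is only an illustration of typical constructs that are valid C but rejected by a C++ compiler, together with forms that compile as both. The `insn` structure and the function names are hypothetical. GCC's `-Wc++-compat` warning option flags constructs of this kind when compiling C.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum reg_kind { REG_GENERAL, REG_FLOAT };

struct insn
{
  int klass;           /* C code may name a field "class"; C++ reserves it. */
  enum reg_kind kind;
};

/* Valid C, but errors in C++:
     struct insn *p = malloc (sizeof *p);   no implicit void * conversion
     p->kind = 1;                           no implicit int -> enum conversion
   The function below uses explicit casts, which both languages accept.  */
struct insn *
make_insn (int klass, int kind_code)
{
  struct insn *p = (struct insn *) malloc (sizeof (struct insn));
  if (p == NULL)
    abort ();
  p->klass = klass;
  p->kind = (enum reg_kind) kind_code;
  return p;
}

/* String literals are const in C++, so return const char * rather than
   char *; C accepts the const-qualified form as well.  */
const char *
kind_name (enum reg_kind kind)
{
  switch (kind)
    {
    case REG_GENERAL: return "general";
    case REG_FLOAT:   return "float";
    }
  return "unknown";
}
```

Code written this way stays within the common subset of C and C++, so it builds unchanged with either compiler.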
In June 2009, almost exactly a year after proposing the change, Taylor reported that phase one was complete. He configured GCC with the switch --enable-build-with-cxx to cause the core compiler to be built with C++. A bootstrap on a single target system was completed. Around this time, the separate cxx branch was merged into the main GCC trunk, and people continued their work, using the --enable-build-with-cxx switch. (However, the separate branch was revived on at least one occasion for experimentation.)
In May 2010, there was a GCC Release Manager Q&A on IRC. The conclusion from that meeting was to request permission from the GCC Steering Committee to use C++ language features in GCC itself, as opposed to just compiling with a C++ compiler. Permission was granted, with agreement also coming from the FSF. Mark Mitchell announced the decision in an email to the GCC mailing list on May 31, 2010.
In that thread, Jakub Jelinek and Vladimir Makarov expressed a lack of enthusiasm for the change. However, as Makarov put it, he had no desire to start a flame war over a decision that had already been made. That said, he recently shared via email that his primary concern was that the GCC community would rush into converting the GCC code base to C++ "instead of working on more important things for GCC users (like improving performance, new functionality and so on). Fortunately, it did not happen."
Richard Guenther was concerned about creating a tree class hierarchy.
The efforts of the proponents to allay concerns, and the "please be careful" messages from the opponents give some indication of the other concerns. In addition to the issues raised by Taylor at the 2008 presentation, Jelinek mentioned memory usage. Others, often as asides to other comments, worried that novice C++ programmers would use the language inappropriately, and create unmaintainable code.
There was much discussion about coding standards in the thread. Several argued for existing standards, but others pointed out that they needed to define a "safe" subset of C++ to use. There was, at first, little agreement about which features of C++ were safe for a novice C++ developer. Taylor proposed a set of coding standards. These were amended by Lawrence Crowl and others, and then were adopted. Every requirement has a thorough rationale and discussion attached. However, the guiding principle on maintainability is not the coding standard, but one that always existed for GCC: the maintainer of a component makes the final decision about any changes to that component.
Current status
Currently, those who supported the changes feel their efforts provided the benefits they expected. No one has publicly expressed any dissatisfaction with the effort. Makarov was relieved that his fear that the conversion effort would be a drain on resources did not come to pass. In addition, he cites the benefits of improved modularity as being a way to make GCC easier to learn, and thus more likely to attract new developers.
As far as speed goes, Makarov noted that a bootstrap on a multi-CPU platform is as fast as it was for C. However, on uniprocessor platforms, a C bootstrap was 30% faster. He did not speculate as to why that is. He also found positive impacts, such as the conversion to C++ hash tables, which sped up compile time by 1-2%. That conversion is ongoing work, on which Lawrence Crowl last reported in October 2012. In keeping with Makarov's concerns, this work is done slowly, as people's time and interests permit.
Of the initial desired conversions (gengtype, tree, and vector), vector support is provided using C++ constructs (i.e., a class) and gengtype has been rewritten for C++ compatibility. Trees are a different matter. Although they have been much discussed and volunteered for several times, no change has been made to the code. This adds credence to the 2010 contention of Guenther (who has changed his surname to Biener) that it would be difficult to do correctly. Reached recently, Biener stated that he felt it was too early to assess the impact of the conversion because, compared to the size of GCC, there have been few changes to C++ constructs. On the negative side, he noted (as others have) that, because of the changes, long-time contributors must relearn things that they were familiar with in the past.
In 2008, 2009, and 2010 (i.e., at the beginning and after each milestone), Taylor provided formal plans for the next steps. There is no formal plan going forward from here. People will use C++ constructs in future patches as they deem necessary, but not just for the sake of doing so. Some will limit their changes to the times when they are patching the code anyway. Others approach the existing C code with an eye to converting code to C++ wherever it makes the code clearer or more efficient. Therefore, this is an ongoing effort on a meandering path for the foreseeable future.
As the C++ project has progressed, some fears have been allayed, while some developers are still in a holding pattern. For them it is too soon to evaluate things definitively, and too late to change course. However, the majority seems to be pleased with the changes. Only time will tell what new benefits or problems will arise.
Index entries for this article | |
---|---|
GuestArticles | Jacobson, Linda |
GCC's move to C++
Posted Mar 14, 2013 4:24 UTC (Thu) by jhhaller (subscriber, #56103)
GCC's move to C++
Posted Mar 14, 2013 16:30 UTC (Thu) by etienne (guest, #25256)
I am not a fan of openat(), but maybe it can be used for GCC.
For server/daemon, you can get to this situation:
Let's assume you want to write a safe TFTP server, only able to upload/download stuff in /tftpboot - so you write your server and it changes directory to /tftpboot before serving requests, only using openat().
Now the user of TFTP has a complex configuration: he keeps /tftpboot-v1 and /tftpboot-v2, and links or renames /tftpboot to whichever is needed at the time.
If you used openat() in tftpd, your user is then forced to restart the tftpd server whenever he changes the configuration; otherwise he gets the content of the directory *when the server was started*...
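The scenario above can be reproduced with a small sketch. It assumes a POSIX system with a writable /tmp, models /tftpboot as a symlink inside a temporary directory, and uses a directory file descriptor where the comment describes a change of working directory; the effect is the same. The helper and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <utility>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Abort with a message if a POSIX call failed (demo-only error handling).
static void check(bool ok, const char *what) {
  if (!ok) { std::perror(what); std::abort(); }
}

// Create `path` and write `text` into it.
static void write_file(const std::string &path, const std::string &text) {
  int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  check(fd >= 0, "open for write");
  check(write(fd, text.data(), text.size()) ==
            static_cast<ssize_t>(text.size()), "write");
  close(fd);
}

// Read a short file from an already opened descriptor, then close it.
static std::string read_and_close(int fd) {
  check(fd >= 0, "open for read");
  char buf[16];
  ssize_t n = read(fd, buf, sizeof buf);
  check(n >= 0, "read");
  close(fd);
  return std::string(buf, static_cast<std::size_t>(n));
}

// Returns {content seen through the old directory descriptor,
//          content seen through a fresh path lookup}.
std::pair<std::string, std::string> stale_directory_demo() {
  char tmpl[] = "/tmp/openat-demo-XXXXXX";
  check(mkdtemp(tmpl) != nullptr, "mkdtemp");
  const std::string root = tmpl;
  const std::string v1 = root + "/tftpboot-v1";
  const std::string v2 = root + "/tftpboot-v2";
  const std::string current = root + "/tftpboot";

  check(mkdir(v1.c_str(), 0755) == 0, "mkdir v1");
  check(mkdir(v2.c_str(), 0755) == 0, "mkdir v2");
  write_file(v1 + "/image", "v1");
  write_file(v2 + "/image", "v2");
  check(symlink("tftpboot-v1", current.c_str()) == 0, "symlink v1");

  // Server start-up: resolve the directory once and keep the descriptor.
  int dir_fd = open(current.c_str(), O_RDONLY | O_DIRECTORY);
  check(dir_fd >= 0, "open directory");

  // The administrator repoints /tftpboot at the second configuration.
  check(unlink(current.c_str()) == 0, "unlink");
  check(symlink("tftpboot-v2", current.c_str()) == 0, "symlink v2");

  // openat() is relative to the directory that was opened at start-up.
  std::string via_dir_fd = read_and_close(openat(dir_fd, "image", O_RDONLY));
  // A full path lookup follows the symlink as it is now.
  std::string via_path =
      read_and_close(open((current + "/image").c_str(), O_RDONLY));
  close(dir_fd);

  // Remove the temporary tree.
  unlink((v1 + "/image").c_str());
  unlink((v2 + "/image").c_str());
  unlink(current.c_str());
  rmdir(v1.c_str());
  rmdir(v2.c_str());
  rmdir(root.c_str());
  return std::make_pair(via_dir_fd, via_path);
}
```

The first element of the returned pair is "v1" and the second is "v2": the descriptor keeps referring to the directory that existed at start-up, which is why the server would need a restart to see the new configuration.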
GCC's move to C++
Posted Mar 14, 2013 9:57 UTC (Thu) by ncm (subscriber, #165)
GCC's move to C++
Posted Mar 14, 2013 16:40 UTC (Thu) by dashesy (guest, #74652)
GCC's move to C++
Posted Mar 14, 2013 16:57 UTC (Thu) by rriggs (guest, #11598)
As to using C++11, the problem is that GCC's implementation is still experimental. It makes it more difficult to ensure that GCC N can be built with both GCC N-1 and N.
*Google provides a reasonable rationale for their standard, but many do not read the rationale or understand it and just blindly accept that it is a reasonable coding standard, even when it does not apply to their situation.
Compiler nannyism
Posted Mar 14, 2013 21:52 UTC (Thu) by ncm (subscriber, #165)
The typical response is, why not just declare the destructor virtual? But we do not always have a choice about what goes into header files we get from others. A bug has been open on this for a long time: an easy fix would be to complain only when the destructor is called in a polymorphic context.
Another rule (echoed in Google's much sillier standard) forbids any use of exceptions. That can be defensible in a program that started out as C.
Compiler nannyism
Posted Mar 14, 2013 22:17 UTC (Thu) by dashesy (guest, #74652)
What most competent programmers overlook is that the next person may not have all the skills, so the code should be as simple and straightforward as possible.
As for using exceptions, it is more of a policy, I think: either all code should be exception-aware or none of it. Then probably the person who initially put together that standard did not like them (maybe they take up too much code real estate and look ugly, maybe she thinks they are just glorified goto statements, maybe exceptions should never cross a shared library boundary for any reason).
Compiler nannyism
Posted Mar 14, 2013 22:43 UTC (Thu) by jwakely (guest, #60262)
That's pretty much what the -Wdelete-non-virtual-dtor warning does, which I added to GCC, "borrowing" it from Clang
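For readers who have not run into the issue, here is a minimal sketch of the situation being discussed. The `Shape` and `Triangle` types are hypothetical, and the sketch assumes C++11. `Shape` stands in for a class from a header that cannot be edited: it has virtual functions but a public, non-virtual destructor. The uses inside the function are well-defined because no object is ever deleted through a pointer to the base class; the case that `-Wdelete-non-virtual-dtor` is aimed at is left in a comment because executing it would be undefined behaviour.

```cpp
#include <cassert>

static int destroyed = 0;  // counts destructor runs, for this demo only

// Stand-in for a third-party class we cannot change.
struct Shape {
  virtual int sides() const { return 0; }
  ~Shape() { ++destroyed; }  // non-virtual destructor
};

struct Triangle final : Shape {
  int sides() const override { return 3; }
  ~Triangle() { ++destroyed; }
};

// Uses Triangle without ever deleting through a Shape pointer.
int use_without_polymorphic_delete() {
  destroyed = 0;
  int total_sides = 0;
  {
    Triangle t;            // automatic storage: destroyed as a Triangle
    const Shape &s = t;    // polymorphic use is still fine
    total_sides += s.sides();
  }
  Triangle *p = new Triangle;
  total_sides += p->sides();
  delete p;                // static type matches dynamic type: well-defined
  assert(total_sides == 6);
  return destroyed;        // each Triangle runs ~Triangle, then ~Shape
}

// The problematic pattern, shown but not executed:
//   Shape *q = new Triangle;
//   delete q;   // undefined behaviour: ~Triangle is never selected
```

A warning issued at the class definition, as `-Wnon-virtual-dtor` does, fires on `Shape` even if the problematic `delete` never appears anywhere in the program; a warning issued at the `delete` expression only fires where the risk is real.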
Compiler nannyism
Posted Mar 21, 2013 17:44 UTC (Thu) by xman (subscriber, #46972)
While the case where you inherit a header is very real, one could make that argument about a LOT of warnings. I think it is reasonable to have it wired in to -W with an option to disable it on a per-header basis (in general, for warnings, you want to investigate and potentially disable them for a particular case... that's the difference between an error and a warning).
GCC's move to C++
Posted Mar 14, 2013 17:10 UTC (Thu) by jwakely (guest, #60262)
Currently GCC trunk can be built by any GCC since 3.4 and, if you're willing to do a bit of hacking, with even older versions. Requiring a C++11 compiler would mean GCC 4.6 or later, or an equivalently new Clang, as before then the language was still in flux and the rules kept changing. You'd force people to bootstrap GCC 4.6 or 4.7 just to then bootstrap 4.8, even if they have an old C++ compiler already installed, which would be quite inconvenient.
For GNU/Linux users it wouldn't be a problem, they'd just get it from their distro, who deal with building it, but it doesn't help anyone to make it awkward for an admin of a ten year old Solaris, HPUX or AIX server to install a new GCC. If we want people to use the GNU compiler instead of their system compiler then the barrier for entry can't be too high.
It wasn't so long ago that GCC could still be bootstrapped starting with a K&R C compiler!
Bootstrapping GCC with K&R C
Posted Mar 14, 2013 21:32 UTC (Thu) by ncm (subscriber, #165)
This would make a good contest.
GCC's move to C++
Posted Mar 15, 2013 4:26 UTC (Fri) by brianomahoney (guest, #6206)
GCC's move to C++
Posted Mar 27, 2013 3:33 UTC (Wed) by eean (guest, #50420)
The coding standard they established for GCC looks highly specific to the legacy codebase they are building on. It's not useful as a general guideline.
GCC's move to C++
Posted Mar 15, 2013 6:36 UTC (Fri) by eru (subscriber, #2753)
I doubt it is useful to require a "safe" or "efficient" subset of C++ and expect that desire to have much effect, unless you have a picky tool for enforcing it. In a large project programmers will have different notions of what is safe, and will use their favourite features anyway. For example, multiple inheritance essentially crept in by accident into the first large C++ project I was involved with, although we also wanted to use "safe" features and not all C++ compilers even supported multiple inheritance at the time. On the other hand, the black-belt hackers who work on GCC can probably be trusted to not shoot themselves in the foot with C++ (unlike most programmers, who should steer clear of it).