[GCC-XML]extra tags needed

Brad King brad.king at kitware.com
Tue Feb 12 16:31:00 EST 2002

> What is the reason for the rewrite?  What features are planned?  Let
> me know what you need help on.  I checked out the gcc-xml source from
> CVS and it seemed pretty sound (minus the stuff that I am lacking of
> course :) ).

The primary reason is the ridiculous size of the current format.  The
first reaction of anyone I've talked to about using it is "wow, that
file is big".  The project that funded it involves wrapping C++ classes in
Tcl, and I had set up a build that took 4.5 gigs, about 4 of which were
just XML dumps to get the class information.

Most of the size comes from repeated type information.  The first rewrite,
currently in CVS, improves on this quite a bit: removing most of the
duplicate information cuts the size by 75%.  However, I was hoping for
a decrease of about 99%.

I read through examples of the new output and realized that the main
bloat comes from information that isn't really needed in the output.
Most of the time, GCC-XML will be used to get information about a specific
subset of the declarations in the translation unit.  No one needs the
standard library's internal declarations to get the interface to their
classes.  By allowing specification of starting points (class, function,
or namespace), and dumping the connected subgraph, most of the extra junk
will be left out while still providing full information for the user.  If
the user really does want everything, they can specify the global
namespace as a starting point.
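
The starting-point idea can be sketched as a plain graph traversal.  This
is only an illustration, not GCC-XML code: the `references` mapping, the
declaration names in it, and the `connected_subgraph` helper are all
hypothetical, and it assumes each declaration can list the declarations it
refers to.

```python
from collections import deque


def connected_subgraph(references, starts):
    """Return the set of declarations reachable from the starting points."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        current = queue.popleft()
        # Follow every reference made by the current declaration.
        for target in references.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical translation unit: each declaration lists what it refers to.
references = {
    "::": ["MyClass", "std"],
    "MyClass": ["MyClass::size", "int"],
    "MyClass::size": ["int"],
    "std": ["std::__internal_helper"],
    "std::__internal_helper": ["char"],
}

# Dump only what is needed to describe MyClass.
wanted = connected_subgraph(references, ["MyClass"])

# Starting from the global namespace pulls in everything.
everything = connected_subgraph(references, ["::"])
```

Starting from `MyClass` leaves out the standard library internals in this
toy graph, while starting from `::` reaches every declaration.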

One other reason for the rewrite is to make the format easier to parse by
not requiring the parser to do all the name lookups.  Currently the parser
needs to know the C++ name scoping rules to do proper bindings.  Since GCC
has already done this, the output format should include this information.  
Then the parser can just build a table of the available elements and
access them as needed based on their id.
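
As a rough sketch of what that buys a consumer, assume every element
carries a unique `id` attribute and refers to other elements by id.  The
element and attribute names below (`Class`, `Method`, `context`, `returns`)
are invented for illustration and are not a statement of the final format.

```python
import xml.etree.ElementTree as ET

# Hypothetical dump: references between elements are already resolved to ids.
SAMPLE = """
<dump>
  <Class id="_1" name="MyClass" members="_2"/>
  <Method id="_2" name="size" context="_1" returns="_3"/>
  <FundamentalType id="_3" name="int"/>
</dump>
"""


def build_table(xml_text):
    """Map each element's id to the element itself."""
    root = ET.fromstring(xml_text)
    return {element.get("id"): element for element in root}


table = build_table(SAMPLE)

# No C++ name lookup is needed: every reference is a direct table access.
method = table["_2"]
return_type = table[method.get("returns")].get("name")
owner = table[method.get("context")].get("name")
```

The consumer never has to apply C++ scoping rules; it only follows ids
through the table.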

