[GCC-XML]gcc2xml and macros

James Michael DuPont mdupont777 at yahoo.com
Mon May 6 13:25:53 EDT 2002


Sebastien, Brad and other list members,

Let me answer some of your questions about the
introspector, a project related to gcc_xml that I
have been working on for some time in my free
time.

Note that what follows is a long post about the
history and development of the introspector, so please
ignore it if you are not interested. I also
attempt to answer Sebastien's questions.

>> - What is the link with
>> http://introspector.sourceforge.net/, if any ?
The introspector and gcc-xml projects are similar, but
not exactly the same.

Originally I was working for a company, Innovative
software GmbH, that was creating a C++ reverse
engineering program. It could display class trees
and relationships, and modify and generate code,
like Rational Rose or Together++.

I had always wondered why they did not use gcc for
parsing; it was much better.
The company had stopped working on and selling the
product, and there were no free software tools that
did the same thing. (Maybe argo-uml for Java, but not
for C and C++.)
At the time I did not understand the tree structures
well enough to be able to create such an interface.

In 1999, four years into my new job, I started on a
different track with the beginnings of the
introspector project: to create an OO C++
interface into the compiler tree structures.

In the summer of 2000, Brad had been posting some
ideas about using XML for dumping output.
His main intention (correct me if I am wrong, Brad)
was to get at the function and type declarations for
creating better wrapper libraries for gcc.

My intention is to create a better user API into the
gcc compiler.
The toolkits of Kitware benefit greatly from having
properly typed language wrappers, better than swig.

A long time passed while I was busy working on my
C++ interface and building in some form of debugging
output to understand the tree structures.

At the OOP 2000 Conference in Munich, I was inspired
to use XML and liked the idea of the intentional
programming that was being researched by Microsoft at
the time.

Afterwards I decided to put my efforts into an XML
exchange format. No easy task.

While fighting with my lack of understanding of the
compiler's tree structures and preparing for my talk
about the introspector at YAPC::Europe, I stumbled
over the cppxml from the University of Waterloo.
They had modified the tree-dumper function of the gcc
compiler and used some transformations on the trees.

I took the tree dumper and modified it to output XML.
That was the basis for my XML interface into gcc.

First I wanted to use DSSSL to process the XML,
and started to learn Scheme. The problem was that I
could not interact with DSSSL enough to learn it; I
like some form of interactive tool to learn a
language.
XSLT was better documented than DSSSL, and I got some
reports running in it.

XSLT is not a good language for processing ASTs.
They are highly networked: you need to be able to
go from any node in any direction, and XSLT is not
efficient at that.
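
To make "highly networked" concrete, here is a minimal
sketch in C++ of nodes that refer to each other by id,
with navigation in both directions. Everything in it is
an assumption for illustration: the Node layout, the
field names "type" and "retn", and the sample graph are
hypothetical and far simpler than the real gcc tree
nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified node: a kind plus named references to other ids.
struct Node {
    std::string kind;                                // e.g. "function_decl"
    std::vector<std::pair<std::string, int> > refs;  // (field name, target id)
};

typedef std::map<int, Node> Graph;

// Forward navigation: follow a named field from one node to its target id.
// Returns -1 when the node or the field does not exist.
int follow(const Graph& g, int from, const std::string& field) {
    Graph::const_iterator it = g.find(from);
    if (it == g.end()) return -1;
    for (std::size_t i = 0; i < it->second.refs.size(); ++i)
        if (it->second.refs[i].first == field) return it->second.refs[i].second;
    return -1;
}

// Backward navigation: one pass builds "target id -> ids that refer to it".
std::map<int, std::vector<int> > reverse_index(const Graph& g) {
    std::map<int, std::vector<int> > back;
    for (Graph::const_iterator it = g.begin(); it != g.end(); ++it)
        for (std::size_t i = 0; i < it->second.refs.size(); ++i)
            back[it->second.refs[i].second].push_back(it->first);
    return back;
}

// Illustrative graph: a function and a variable that share one integer type.
Graph sample_graph() {
    Graph g;
    g[1].kind = "function_decl";
    g[1].refs.push_back(std::make_pair(std::string("type"), 2));
    g[2].kind = "function_type";
    g[2].refs.push_back(std::make_pair(std::string("retn"), 3));
    g[3].kind = "integer_type";
    g[4].kind = "var_decl";
    g[4].refs.push_back(std::make_pair(std::string("type"), 3));
    return g;
}

// Usage: walk forward from the function, then ask who uses the integer type.
void demo() {
    Graph g = sample_graph();
    assert(follow(g, follow(g, 1, "type"), "retn") == 3);
    std::map<int, std::vector<int> > back = reverse_index(g);
    assert(back[3].size() == 2);  // referenced by nodes 2 and 4
}
```

The shared integer_type node is the point: it has more
than one parent, so the data is a graph rather than a
tree, and the "who refers to me" question needs an
index of its own.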

Prolog was my next choice, but the memory limitations
of gnu-prolog stopped me from doing more than
30000 nodes at once.

After using XSLT to translate the structures into
Prolog in order to query them, I discovered the
overall organization of the tree structures.

This understanding led me to translate the XML into
Perl programs that, when run, would construct the tree
structures in memory without using an XML parser.

I had abandoned my C++ interface for a Perl interface
and started to write reports and code generators in
Perl.

Afterwards I removed the XSLT and replaced it with an
XML parser.

Later I put the Perl program in a pipe from the
compiler, so that the compiler's results go
directly into Perl.

I have built a Postgres and MySQL repository of the
tree structures and have written hundreds of queries
that have shown me how the tree nodes really work.

In the past three months, I have been in contact with
the FSF about getting support for my project.

Currently we are in discussions to prevent the abuse
of the XML and external data by third parties.

The issue is to prevent the usage of the gcc 
by non-free software in such a way that they can
piggyback on top of it. 

The introspector opens such a hole in its
current incarnation and has to be changed.

My intent is to remove *ALL* forms
of XML, SQL and external data until an updated version
of the GPL comes out that prevents the usage of
that data in non-free software.

I would be very careful if you intend to use the GCC
internal data in any form of non-free, non-GPL
software. You might be opening yourself up to a
violation of the GPL.

This applies especially if you have to redistribute
gcc and the only purpose of that redistribution and
modification is to extract that data into non-free
software.

Currently I am working on linking GCC directly to
Perl.
Then all the needed data can be extracted directly
from the compiler.

My intent is also to link in a GUI directly into the
whole thing, and remove any data files that need to be
exchanged. 

Sebastien,
As to the transformations on code that you are
interested in doing, I think that the best and safest
bet would be to create a compiler extension. You can
then have your own code that does the tree
transformations inside of the compiler.
See the Ast-optimizer branch for that.

In the future, the introspector might make it easier
to do such a thing, but you will need to be willing
to make your modifications GPLed for me to help you
out.

>>- is there a xml -> gcc front-end already built ?
No, and I think that it is not wished for; it would
make abuse too easy.

>> - is it possible to introduce some "new macros"
>>constructions in C/C++ that gccxml will leave
>>untouched and that could be interpreted on an XML
>>processor ?
I think that you would be best off declaring a
special type, let's say FunkyPointer<T>, that you can
use to declare extra information in your program.

Or if you want to add attributes, a template would be
good.
HelpName<T,"STRING"> can be extracted easily from
inside the compiler.
So if you have a float, HelpName<float,"A float is a
floating point number">
would add an extra attribute to your program.
The xml of gcc_xml could then be scanned for
such instances, and they could be extracted and
processed.
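
As a rough sketch of how such annotation types could be
written in C++: standard C++ does not accept a string
literal directly as a template argument, so this version
passes the help text as a named character array instead.
It assumes a C++11 or later compiler. The names
kFloatHelp, DocumentedFloat, Sample and
check_annotations are made up for illustration;
FunkyPointer and HelpName are the names suggested above.

```cpp
#include <cassert>

// Marker type: behaves like a plain pointer, but its distinct name
// shows up in the declarations that the compiler sees.
template <typename T>
struct FunkyPointer {
    T* ptr;
};

// Help text as a named array (hypothetical name), since a string
// literal cannot be used directly as a template argument.
constexpr char kFloatHelp[] = "A float is a floating point number";

// Wraps a value of type T and ties a help string to the type itself.
template <typename T, const char* Help>
struct HelpName {
    T value;
    static const char* help() { return Help; }
};

typedef HelpName<float, kFloatHelp> DocumentedFloat;

// A declaration that uses both annotations.
struct Sample {
    DocumentedFloat ratio;
    FunkyPointer<int> counter;
};

// The wrapper carries no extra storage on typical implementations.
static_assert(sizeof(DocumentedFloat) == sizeof(float),
              "annotation should not add storage");

// Usage: the annotated member is still an ordinary float underneath.
inline void check_annotations() {
    Sample s = { { 0.5f }, { nullptr } };
    assert(s.ratio.value == 0.5f);
    assert(s.counter.ptr == nullptr);
    assert(DocumentedFloat::help() == kFloatHelp);
}
```

A tool reading the declarations would then look for the
type names HelpName and FunkyPointer and pull out their
template arguments. Whether and how those arguments
appear in the gcc_xml output is not something this
sketch shows.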

>>As XML is real close to Lisp/Scheme way of thinking
>>(list processing) and that Lisp/Scheme macros are
>>much more powerful (expressive), I think this
>>approach can be interesting.
That is similar to the XSLT work I was doing.

Anyway, I hope that this all presents some insight
into the purpose of the introspector and its
relationship to the gcc_xml.
And again, I can only hope that you can resolve the
licensing issues with the FSF for yourselves.
I know that Brad has written to them, but I would
also consider warning your users that the licensing is
an issue, and tell them to consider GPLing all code
that uses the AST information, even indirectly, from
GCC.

Just for the record, I see the gcc_xml project and
cabel as being valuable tools. 

I don't think that they are abusing the GPL directly, 
but my opinion does not matter much.

You have to make sure that your users understand the
possibilities of a GPL violation.
Mike

=====
James Michael DuPont
