[GCC-XML] Xrtti.h

Wesley Tansey tansey at vt.edu
Thu Apr 26 10:01:08 EDT 2007


This is a great discussion so far.  I'm actually also working on a 
similar tool to automate C++ serialization which uses GCCXML as its 
first step.  My approach is different in a few ways though:

1. It uses a visual tool.
2. It's intended for the high performance computing (HPC) community.
3. As a result of 2, the serialization is to MPI, not sockets.

We've encountered most of the issues you guys have discussed in the 
email, and I think with the visual tool we've solved all of them.  The 
tool allows the user to select a datatype from their program and then 
gives them a treeview of the fields in that datatype.  Then they just 
check off what they want to serialize and click generate.  The code then 
gets generated to mirror MPI's normal Send, Recv, etc., calls so that it 
looks like you're still using MPI calls but you're actually using the 
code generated by the tool.
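
To make the idea concrete, here is a minimal sketch of what code for a hand-picked subset of fields could look like. It is an illustration only, not output from the tool: the Particle type and the packParticle/unpackParticle names are hypothetical, and the actual MPI call is left out so the sketch compiles without an MPI installation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical user datatype. Assume the user checked position and velocity
// in the tree view and left debugId unchecked.
struct Particle
{
    double position[3];
    double velocity[3];
    int debugId;
};

// Size in bytes of the selected fields only (debugId is not included).
const std::size_t kPackedSize = sizeof(double) * 6;

// Copy only the selected fields into a contiguous byte buffer.
std::vector<unsigned char> packParticle(const Particle &p)
{
    std::vector<unsigned char> buf(kPackedSize);
    std::memcpy(buf.data(), p.position, sizeof(p.position));
    std::memcpy(buf.data() + sizeof(p.position), p.velocity, sizeof(p.velocity));
    return buf;
}

// Restore the selected fields from the buffer; unselected fields are untouched.
void unpackParticle(const std::vector<unsigned char> &buf, Particle &p)
{
    assert(buf.size() == kPackedSize);
    std::memcpy(p.position, buf.data(), sizeof(p.position));
    std::memcpy(p.velocity, buf.data() + sizeof(p.position), sizeof(p.velocity));
}
```

A Send-style wrapper could then pass buf.data() and buf.size() to MPI_Send with the MPI_BYTE datatype, and a Recv-style wrapper could call unpackParticle on the bytes it receives.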

If you guys are interested, we'll be submitting a paper on the complete 
tool to ASE 2007 in early June.  I can post a link to the paper and the 
tool's website when it's all ready.

Wesley Tansey
Graduate Student
Department of Computer Science
Virginia Tech
http://people.cs.vt.edu/~tansey/



Bryan Ischo wrote:
>> Our differences, as far as I understand:
>>
>> Your scheme,
>> 1. Define CPP_Interface through class definitions and using basic types.
>> 2. Use GCCXML to generate CPP_Interface_XML
>> 3. Use CPP_Interface_XML to generate serialization code.
>>     
>
> Yes, exactly.
>
>   
>> My scheme,
>> 1. Define Basic Types and User Defined types in XSD.
>> 2. a. A parser would understand and generate the CPP or Python or Perl
>>       classes for class declarations
>>    b. The parser would also generate Serialization methods which will be
>>       say in binary form.
>>     
>
> Yes, I understand now what you are saying.  I realize that XSD is a
> standard document format and that you will have the advantages that go
> along with that.
>
>   
>> I understand your dislike of XML for serialization output, but I see
>> no reason why C++ classes would win over XSD for interface specification.
>>     
>
> That is a good point; I have also thought of this - why not use a separate
> data type definition language (XSD based or other IDL type thing) and then
> generate the C++ code (or Java code, Python code, etc) instead?
>
> It's a difference between starting with C++ and getting serialization, or
> starting with XSD/IDL and getting serializable C++.
>
> I think of my approach as more like adding a missing feature to C++ - more
> comprehensive reflection (Xrtti), and serialization.  Just like Java has
> built-in serialization support, I had hoped to fill in the gaps for C++
> and add the same features to it, in a way that as nearly as possible
> seemed like a natural feature of the language.
>
> It's just a matter of different approaches; I can certainly see the value
> of  what you propose, because it is language-agnostic and can be used to
> generate bindings for any language at all.  On the other hand, to
> accomplish this it has to define a new language of its own (the XSD
> specification, IDL, whatever) that the developer has to manage.  I am
> going with my approach because it means that a C++ developer has nothing
> to do except write C++ classes as they normally would (with some caveats
> that I discuss below), and they get serialization nearly "for free" with
> Xrtti + XrttiSerial (or whatever I end up calling my serialization
> library).
>
> Probably the best of both worlds would be to have tools which:
>
> 1. Generate a common XSD format from C++ class definitions
> 2. Generate C++ class definitions from a common XSD format
> 3. Generate C++ serialization from the XSD format
>
> Then, the developer could choose to start with the XSD and generate their
> C++ classes (as in 2), or start with C++ classes and generate XSD (as in
> 1).  Either way, using (3), they end up with both C++ classes and the
> means for serializing them.  And other languages could be supported too.
>
> While I agree that this provides the most flexibility, and I will
> certainly consider altering my serialization project in the future to use
> an intermediate description language like your XSD, for the time being I
> am going to do the simpler thing of just always starting with C++ and
> using Xrtti and my serialization library to add serialization to any C++
> code.  It solves the problem I am trying to solve, in a way that requires
> minimal extra steps for the developer.
>
>   
>> In fact, in your scheme XML is indeed in one of the steps before feeding to
>> your code generator.
>>     
>
> Yes, sadly, this is true.  XML is, to me, such a pig of a document format,
> that it is very hard to work with.  In the free software world, it is very
> hard to find a reasonable XML parser.  As far as I can tell, expat is far
> and away the most popular and widely used simple XML parser (that doesn't come
> attached with a huge framework of other software).  And expat is so very
> very kludgy in its API (in my opinion), it represents to me the problem
> with XML: it is so hard to write parsers for, that no one bothers to write
> good parsers for it.  XML parsers end up being just barely "good enough",
> with the exhausted developer who has written the XML parser left with no
> energy to make the parser really good, such as being able to parse XML
> documents in fragments instead of one big chunk.
>
> Anyway, gccxml generates XML (unfortunately) and my xrttigen program does
> use expat to parse the XML, and I am glad that expat exists despite my
> complaints because otherwise I'd have to write a parser for XML myself.
>
>   
>> I do have one tricky problem with defining the interface: how do you handle
>> conditional definitions?
>>
>> For example,
>> class response
>> {
>>         bool status;
>>         UserDefinedTypeSuccessRecord record;
>> };
>> Suppose we want UserDefinedTypeSuccessRecord to be valid, or serialized,
>> only if status is set to true.
>>     
>
> Unfortunately C++ has some real problems when it comes to serialization;
> the language is a little too free-form to allow just any old C++ classes
> to be serialized without programmer intervention.  In cases like you have
> mentioned there are two options, as far as I can tell:
>
> 1. Provide mechanisms for the developer to put custom code in to guide
> serialization in places where it cannot be automatically deduced from the
> class structure (Boost and s11n and others work this way, requiring
> *every* class to have custom serialization code written by hand by the
> developer).
>
> 2. Require the developer to follow certain conventions when defining their
> C++ classes so that the serializer can always do the right thing.
>
> For my purposes, I will be using (2).  This means that for example if a
> developer defines a class with an array:
>
> class response
> {
>       int *idArray;
> };
>
> Then the developer will have to follow a specific convention for declaring
> how many elements are in the idArray at the time that serialization is
> done on the class; in my case, I will require that there be a member
> called "idArray_count" which gives the number of elements.  So the
> developer would be required to do:
>
> class response
> {
>       int *idArray;
>       int idArray_count;
> };
>
> The serializer would give an error and refuse to serialize any class which
> doesn't do this.
>
> This means that not every C++ class can be serialized, but the
> requirements for supporting serialization are really pretty minimal and
> not hard to implement.
>
> I wish it were possible to automatically serialize every class but
> unfortunately without developer support (option 1 that I outlined above)
> this is not possible.  I would rather not complicate the developer's life
> by making them write code to custom serialize classes; I think it is
> cleaner and neater to simply require a few conventions that developers
> must follow when defining serializable classes.
>
> To bring this back to your example above, I would probably say that the
> developer should instead define their class like this:
>
> class response
> {
>       bool status;
>       UserDefinedTypeSuccessRecord record[1];
>       int record_count;
> };
>
> And the developer would have to make sure that record_count was 1 whenever
> record was to be used.  The developer could even drop status in favor of
> just record_count, which would be 1 on success and 0 on error, and serve
> both purposes of indicating status, and indicating how many valid entries
> are in the record array (zero or one).
>
>   
>> We can continue the conversation in private if we are boring the rest of
>> the group :-) My id is shiva at qualcomm.com
>>     
>
> This list gets very little traffic, and I don't think it's annoying too
> many people.  If anyone wants us to go private, please speak up.  I will
> not be offended :)
>
> Thank you, and best wishes,
> Bryan
>
> ------------------------------------------------------------------------
> Bryan Ischo                bryan at ischo.com            2001 Mazda 626 GLX
> Hamilton, New Zealand      http://www.ischo.com     RedHat Fedora Core 5
> ------------------------------------------------------------------------
>
>   
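
The "_count" convention from the quoted message can be sketched as follows. This is only an illustration under stated assumptions, not the actual Xrtti-based serializer: the serialize and deserialize functions are hypothetical, hand-written stand-ins, the members are made public to keep the sketch short, and the byte layout (native int count followed by the elements) is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// The class from the quoted message, following the "<member>_count" convention.
// "public:" is added here only so the free functions below can reach the members.
class response
{
public:
    int *idArray;
    int idArray_count;
};

// Hypothetical serializer: write the count first, then that many elements.
std::vector<unsigned char> serialize(const response &r)
{
    assert(r.idArray_count >= 0);
    assert(r.idArray_count == 0 || r.idArray != nullptr);
    const std::size_t n = static_cast<std::size_t>(r.idArray_count);
    std::vector<unsigned char> out(sizeof(int) + n * sizeof(int));
    std::memcpy(out.data(), &r.idArray_count, sizeof(int));
    if (n > 0)
        std::memcpy(out.data() + sizeof(int), r.idArray, n * sizeof(int));
    return out;
}

// Hypothetical deserializer: read the count, then allocate and fill the array.
// The caller owns the returned idArray and must delete[] it.
response deserialize(const std::vector<unsigned char> &in)
{
    response r;
    r.idArray = nullptr;
    r.idArray_count = 0;
    assert(in.size() >= sizeof(int));
    std::memcpy(&r.idArray_count, in.data(), sizeof(int));
    assert(r.idArray_count >= 0);
    const std::size_t n = static_cast<std::size_t>(r.idArray_count);
    assert(in.size() == sizeof(int) + n * sizeof(int));
    if (n > 0) {
        r.idArray = new int[n];
        std::memcpy(r.idArray, in.data() + sizeof(int), n * sizeof(int));
    }
    return r;
}
```

Without idArray_count, nothing in the class says how many ints idArray points to, which is why a serializer following this convention would have to reject a class that omits it.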



