Sven,

Great! Thanks for sharing that example.

I have to admit that I was shocked to see that the Hollerith format
(variable-length tokens) is actually used in a real input file with a
standard behind it. They did that because, at the time this format was
invented, not all Fortran compilers had character arrays.
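
For what it's worth, here is a minimal Python sketch of what a hand-written
lexer has to do for these tokens. It is not Xtext code and it is not taken
from the IGES standard; the function name `tokenize` and the hard-coded ','
and ';' delimiters are assumptions for illustration only (a real file
declares its delimiters itself, as in the '1H,,1H;' at the start of the
sample quoted below). The point is that the count in front of the H decides
how many characters belong to the string, so a comma inside the string must
not split it.

```python
def tokenize(record, delim=",", terminator=";"):
    """Split a free-format record into tokens, honouring Hollerith strings.

    A Hollerith string is written <count>H<text>, where <count> is the exact
    number of characters in <text>. The text may contain the delimiter or
    the terminator, so it cannot be found by searching for the next comma.
    """
    tokens = []
    i, n = 0, len(record)
    while i < n:
        # A run of digits is either a Hollerith count or the start of a number.
        j = i
        while j < n and record[j].isdigit():
            j += 1
        if j > i and j < n and record[j] == "H":
            count = int(record[i:j])
            text = record[j + 1:j + 1 + count]
            if len(text) < count:
                raise ValueError("Hollerith string runs past end of record")
            tokens.append(("STRING", text))
            i = j + 1 + count
            # Allow padding blanks between the string and the delimiter.
            while i < n and record[i] == " ":
                i += 1
            if i < n and record[i] not in (delim, terminator):
                raise ValueError(f"expected delimiter at position {i}")
        else:
            # Ordinary value: everything up to the next delimiter or terminator.
            j = i
            while j < n and record[j] not in (delim, terminator):
                j += 1
            tokens.append(("VALUE", record[i:j].strip()))
            i = j
        if i < n and record[i] == terminator:
            break
        i += 1  # step over the delimiter
    return tokens


if __name__ == "__main__":
    print(tokenize("11HHello World"))
    print(tokenize("31HD. A. Harrod, Tel. 313/995-6333,4,0;"))
```

A regular-expression token rule cannot express "read exactly n characters,
where n was just lexed", which is why this part ends up in a custom lexer
rather than in the grammar.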

Jay

On Tue, Jun 21, 2016 at 9:41 AM, Sven Efftinge ([email protected]) <
[email protected]> wrote:

> Hi Jay, hi Kasper,
>
> Not sure what other things are hidden in the details of that format, but
> you will need to use a custom lexer to support the variable-length tokens
> (e.g. '11HHello World'). A starting point for doing this might be
> https://github.com/TypeFox/xtext-jflex
> Note that this just replaces the lexer generator, but for your format you
> should hand-write a lexer and bind it into Xtext.
>
> Sven
>
> 2016-06-21 3:14 GMT+02:00 Jay Jay Billings <[email protected]>:
>
>> Sven,
>>
>> I just saw your response in the archives (I only now joined xtext-dev).
>> Yes, our goal is to have a parser and editor based on an Xtext grammar.
>> We're trying to get away from manually written parsers.
>>
>> Jay
>>
>> On Mon, Jun 20, 2016 at 2:36 PM, kaspergam <[email protected]>
>> wrote:
>>
>>> I am an intern working on the Eclipse Advanced Visualization Project
>>> (EAVP), and we are using Xtext to parse files and import them into our data
>>> structures. Using Xtext, I was able to accomplish this fairly easily for
>>> STL files <https://en.wikipedia.org/wiki/STL_(file_format)>, because STL
>>> has keywords followed by data, which makes for a simple Xtext grammar.
>>>
>>> However, we are now focusing on IGES (Initial Graphics Exchange
>>> Specification) geometry files, which are not as easy to parse. IGES
>>> files are ASCII records with 80 characters per line. These lines need to be
>>> split either by length (in characters) or by a delimiter. It seems
>>> difficult to get Xtext to read files line by line or even character by
>>> character. More information on IGES can be found here:
>>> <https://en.wikipedia.org/wiki/IGES>
>>>
>>> The main difficulty is deciding which type of line the parser is currently
>>> reading. The file is divided into sections, and the section a line belongs
>>> to is given by its 73rd character. I cannot find a way to easily
>>> read this character before starting to parse the rest of the line. For
>>> example, here is the file from the Wikipedia page:
>>>
>>>                                                                         S      1
>>> 1H,,1H;,4HSLOT,37H$1$DUA2:[IGESLIB.BDRAFT.B2I]SLOT.IGS;,                G      1
>>> 17HBravo3 BravoDRAFT,31HBravo3->IGES V3.002 (02-Oct-87),32,38,6,38,15,  G      2
>>> 4HSLOT,1.,1,4HINCH,8,0.08,13H871006.192927,1.E-06,6.,                   G      3
>>> 31HD. A. Harrod, Tel. 313/995-6333,24HAPPLICON - Ann Arbor, MI,4,0;     G      4
>>>      116       1       0       1       0       0       0       0       1D      1
>>>      116       1       5       1       0                               0D      2
>>>      116       2       0       1       0       0       0       0       1D      3
>>>      116       1       5       1       0                               0D      4
>>>      100       3       0       1       0       0       0       0       1D      5
>>>      100       1       2       1       0                               0D      6
>>>      100       4       0       1       0       0       0       0       1D      7
>>>      100       1       2       1       0                               0D      8
>>>      110       5       0       1       0       0       0       0       1D      9
>>>      110       1       3       1       0                               0D     10
>>>      110       6       0       1       0       0       0       0       1D     11
>>>      110       1       3       1       0                               0D     12
>>> 116,0.,0.,0.,0,0,0;                                                    1P      1
>>> 116,5.,0.,0.,0,0,0;                                                    3P      2
>>> 100,0.,0.,0.,0.,1.,0.,-1.,0,0;                                         5P      3
>>> 100,0.,5.,0.,5.,-1.,5.,1.,0,0;                                         7P      4
>>> 110,0.,-1.,0.,5.,-1.,0.,0,0;                                           9P      5
>>> 110,0.,1.,0.,5.,1.,0.,0,0;                                            11P      6
>>> S      1G      4D     12P      6                                        T      1
>>>
>>> Here I need to skip the S and G rows, and after that read each number from
>>> the D rows into fields: every 8 characters hold one value (after removing
>>> white-space). The P rows need to be read in as comma-separated values,
>>> where the last bit (#P       #) is ignored. This is only a primitive parse
>>> of this file, but still good enough to read in the critical data. Do you
>>> think Xtext will be helpful or even usable to parse this kind of file? What
>>> kind of grammar could we write to handle these problems? Any help would be
>>> appreciated!
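
The primitive parse described above can be sketched in a few lines of plain
Python, outside of Xtext. This is only an illustration under stated
assumptions: the function names are made up, every record is assumed to be a
full 80-column line (not wrapped by a mail client), each P entry is assumed
to fit on one line, and the P values are split on ',' without any Hollerith
handling.

```python
def split_record(line):
    """Split one 80-column record into (data, section letter, sequence number)."""
    line = line.rstrip("\n").ljust(80)
    # Columns 1-72 hold the data, column 73 the section letter,
    # columns 74-80 the right-justified sequence number.
    return line[:72], line[72], int(line[73:80])


def parse_directory(data):
    """Cut the 72 data columns of a D row into nine 8-character fields."""
    return [data[i:i + 8].strip() for i in range(0, 72, 8)]


def parse_parameters(data):
    """Read a P row as comma-separated values, ignoring the trailing number."""
    body = data[:64].strip().rstrip(";")
    return body.split(",")


def read_records(lines):
    """Collect the D and P rows; S, G and T rows are skipped."""
    directory, parameters = [], []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        data, section, _sequence = split_record(line)
        if section == "D":
            directory.append(parse_directory(data))
        elif section == "P":
            parameters.append(parse_parameters(data))
    return directory, parameters


if __name__ == "__main__":
    # Build two sample rows programmatically so the column widths are exact.
    d_row = "".join(f"{v:>8}" for v in (116, 1, 0, 1, 0, 0, 0, 0, 1)) + "D" + f"{1:>7}"
    p_row = "116,0.,0.,0.,0,0,0;".ljust(64) + f"{1:>8}" + "P" + f"{1:>7}"
    print(read_records([d_row, p_row]))
```

Reading column 73 first and only then deciding how to cut the rest of the
line is a fixed-column step that happens before any tokenizing, which is the
part that is awkward to express in a grammar.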
>>>
>>> Thank you so much,
>>> Kasper Gammeltoft
>>> Oak Ridge National Laboratory,
>>> Computer Science and Mathematics Division,
>>> Computer Science Research Group
>>> (865)-696-6625
>>> [email protected]
>>>
>>>
>>> _______________________________________________
>>> eavp-dev mailing list
>>> [email protected]
>>> To change your delivery options, retrieve your password, or unsubscribe
>>> from this list, visit
>>> https://dev.eclipse.org/mailman/listinfo/eavp-dev
>>>
>>>
>>
>>
>> --
>> Jay Jay Billings
>> Oak Ridge National Laboratory
>> Twitter Handle: @jayjaybillings
>>
>> _______________________________________________
>> xtext-dev mailing list
>> [email protected]
>> To change your delivery options, retrieve your password, or unsubscribe
>> from this list, visit
>> https://dev.eclipse.org/mailman/listinfo/xtext-dev
>>
>



-- 
Jay Jay Billings
Oak Ridge National Laboratory
Twitter Handle: @jayjaybillings