[plug] Using YACC
Buddrige, David
BuddrigeD at logica.com
Fri Aug 2 17:45:39 WST 2002
Hi all,
This may be moderately off-topic (it is actually a Linux problem, albeit a
programming problem 8-) ). I am teaching myself to use lex and yacc in
order to solve a problem we have here. We have a bunch of source code with
in-line comments all through it. We want to re-arrange those comments so
that we can use the DOC++ tool to generate documentation. The comments as
they exist have been written to a company standard, so they are reasonably
consistent; however, they are not how DOC++ wants to see them. Basically,
rather than manually going through the 1-2 million lines of code we have
here and re-arranging the comments for DOC++, we would like to get a program
to do that for us.
To do this, I have begun writing a very simple grammar using lex and yacc.
I have never written a full grammar before; I have only ever hacked someone
else's existing working code, so I am still a learner when it comes to
the nitty-gritty details. Anyway, here is my problem.
I have written this grammar to recognise comments in the code (either C or
C++ style) and, for now, simply print them to stdout (see the code
at the end of this email).
What I am trying to do is to define a "file" as containing 0 or more lines.
However, while it is fairly straightforward in lex (using extended regular
expressions) to define a token that has repeating parts, I haven't yet
figured out how to define a grammatical component in yacc that has repeating
components.
My "file" rule in parser.y, shown below, needs to consist of 0 or more lines.
At present, when reading an input stream, it parses the first line of the
input perfectly, but the next line causes a syntax error (because a
file is currently defined as having exactly one line).
Does anyone know how to define a grammatical element that has (potentially)
infinitely repeating parts, i.e. so that a file can have any number of
lines?
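For reference, the usual yacc idiom for "zero or more" is a recursive rule with an empty alternative. The fragment below is only a sketch of how the "file" rule might be written along those lines. It assumes the "line" rule and the token declarations in parser.y stay exactly as posted, and it has not been tried against the real source tree.

```yacc
/* Sketch: replaces only the "file" rule in parser.y. */
file: /* empty: a file may contain no lines at all */
    | file line     { printf("line!\n"); }
    ;
```

Left recursion ("file line" rather than "line file") is generally preferred in yacc, because the parser can reduce after each line instead of stacking every line until end of input.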
thanks heaps guys
David Buddrige
parser.y:
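%{
#include <stdio.h>
#include <string.h>

/* yylex() comes from the lex-generated scanner (scanner.l). */
int yylex(void);
int yyparse(void);

/* Called by yyparse() when it detects a syntax error. */
void yyerror(const char *str)
{
    fprintf(stderr, "error: %s\n", str);
}

int main(void)
{
    return yyparse();
}
%}

%token C_COM_START C_COM_END CPP_COM TEXT EOLN

%%

/* A file is exactly one line here; this is the rule the question is about. */
file: line { printf("line!\n"); }
    ;

line: EOLN                  { printf("blank line!\n"); }
    | TEXT c_comment EOLN   { printf("c_comment!\n"); }
    | TEXT cpp_comment      { printf("cpp_comment!\n"); }
    | TEXT EOLN             { printf("ordinary line!\n"); }
    ;

cpp_comment: CPP_COM TEXT EOLN
    ;

c_comment: C_COM_START TEXT C_COM_END
    ;

%%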
The parser takes its tokens from the following lex-built scanner:
scanner.l:
/*
** INCLUDE FILES
*/
%{
#include <stdio.h>
/*Note that y.tab.h is generated by using yacc with the -d option*/
#include "y.tab.h"
%}
C_COM_START \/\*
C_COM_END \*\/
CPP_COM \/\/
TEXT [a-zA-Z0-9]
EOLN \n
WHITESPACE [\t\ ]
%%
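{C_COM_START} {
    return C_COM_START;
}
{C_COM_END} {
    return C_COM_END;
}
{CPP_COM} {
    return CPP_COM;
}
{TEXT}+ {
    /* One or more characters; "*" would also allow an empty match. */
    return TEXT;
}
{EOLN} {
    return EOLN;
}
{WHITESPACE} {
    /* Ignore whitespace */
}
%%
/* Returning 1 tells the scanner there is no further input after EOF. */
int yywrap(void)
{
    return 1;
}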
/*
int main(int argc, char **argv)
{
printf(yytext);
yylex();
}
*/