
What is a 'semantic predicate' in ANTLR?

nasanasas 2020. 8. 19. 08:15


What is a semantic predicate in ANTLR?


ANTLR 4

For predicates in ANTLR 4, see the Stack Overflow Q&As dedicated to that version.


ANTLR 3

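A semantic predicate is a way to enforce extra (semantic) rules on top of the grammar, using plain code.

There are three types of semantic predicates:

  • validating semantic predicates;
  • gated semantic predicates;
  • disambiguating semantic predicates.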

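Example grammar

Let's say you have a block of text consisting only of numbers separated by commas, ignoring any whitespace. You would like to parse this input while making sure that the numbers are at most 3 digits "long" (at most 999). The following grammar (Numbers.g) does that: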

grammar Numbers;

// entry point of this parser: it parses an input string consisting of at least
// one number, followed by zero or more comma-separated numbers
parse
  :  number (',' number)* EOF
  ;

// matches a number that is between 1 and 3 digits long
number
  :  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

// matches a single digit
Digit
  :  '0'..'9'
  ;

// ignore spaces
WhiteSpace
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

Testing

The grammar can be tested with the following class:

import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("123, 456, 7   , 89");
        NumbersLexer lexer = new NumbersLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        NumbersParser parser = new NumbersParser(tokens);
        parser.parse();
    }
}

Test it by generating the lexer and parser, compiling all .java files, and running the Main class:

java -cp antlr-3.2.jar org.antlr.Tool Numbers.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar Main

When doing so, nothing is printed to the console, which indicates that nothing went wrong. Try changing:

ANTLRStringStream in = new ANTLRStringStream("123, 456, 7   , 89");

into:

ANTLRStringStream in = new ANTLRStringStream("123, 456, 7777   , 89");

and run the test again: you will see an error appear on the console right after the string 777.
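
As a cross-check that does not need ANTLR at all, the set of inputs this grammar accepts can be approximated with a regular expression. This is only a sketch: the class name `ThreeDigitListCheck` and the method `accepts` are made up for illustration, and it assumes (as the grammar does) that spaces, tabs and line breaks are simply dropped before matching.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the generated lexer or parser.
class ThreeDigitListCheck {
    // One to three digits, then zero or more ",<one to three digits>" groups.
    private static final Pattern NUMBERS =
            Pattern.compile("[0-9]{1,3}(,[0-9]{1,3})*");

    static boolean accepts(String input) {
        // The grammar skips ' ', '\t', '\r' and '\n', so remove them first.
        String stripped = input.replaceAll("[ \\t\\r\\n]", "");
        return NUMBERS.matcher(stripped).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("123, 456, 7   , 89"));    // prints true
        System.out.println(accepts("123, 456, 7777   , 89")); // prints false
    }
}
```

The second input is rejected for the same reason the parser complains: after three digits, only a comma or the end of the input may follow.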


Semantic Predicates

This brings us to the semantic predicates. Let's say you want to parse numbers between 1 and 10 digits long. A rule like:

number
  :  Digit Digit Digit Digit Digit Digit Digit Digit Digit Digit
  |  Digit Digit Digit Digit Digit Digit Digit Digit Digit
     /* ... */
  |  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

would become cumbersome. Semantic predicates can help simplify this type of rule.


1. Validating Semantic Predicates

A validating semantic predicate is nothing more than a block of code followed by a question mark:

RULE { /* a boolean expression in here */ }?

To solve the problem above using a validating semantic predicate, change the number rule in the grammar into:

number
@init { int N = 0; }
  :  (Digit { N++; } )+ { N <= 10 }?
  ;

The parts { int N = 0; } and { N++; } are plain Java statements, the first of which is executed when the parser "enters" the number rule. The actual predicate is { N <= 10 }?, which causes the parser to throw a FailedPredicateException whenever a number is more than 10 digits long.

Test it by using the following ANTLRStringStream:

// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890"); 

which produces no exception, while the following does throw an exception:

// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");
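
To make the order of events concrete, here is a minimal hand-written sketch of what the validating predicate amounts to. It is not the code ANTLR generates: the class `ValidatingPredicateSketch` is hypothetical, it assumes the input contains no whitespace, and it uses an `IllegalStateException` in place of ANTLR's `FailedPredicateException`. The point is that all digits are consumed first and the check on N happens afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated parser (not ANTLR output).
// Assumes the input contains no whitespace.
class ValidatingPredicateSketch {
    private final String input;
    private int pos = 0;

    ValidatingPredicateSketch(String input) {
        this.input = input;
    }

    // parse : number (',' number)* EOF
    List<String> parse() {
        List<String> numbers = new ArrayList<>();
        numbers.add(number());
        while (pos < input.length() && input.charAt(pos) == ',') {
            pos++; // consume ','
            numbers.add(number());
        }
        if (pos != input.length()) {
            throw new IllegalStateException("syntax error at index " + pos);
        }
        return numbers;
    }

    // number : (Digit { N++; })+ { N <= 10 }?
    private String number() {
        int n = 0; // plays the role of @init { int N = 0; }
        int start = pos;
        while (pos < input.length() && isDigit(input.charAt(pos))) {
            pos++; // consume the Digit
            n++;   // the embedded action { N++; }
        }
        if (n == 0) {
            throw new IllegalStateException("syntax error at index " + pos);
        }
        // The predicate is only checked after every digit has been consumed.
        if (!(n <= 10)) {
            throw new IllegalStateException("failed predicate: N <= 10");
        }
        return input.substring(start, pos);
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(new ValidatingPredicateSketch("1,23,1234567890").parse());
    }
}
```

Feeding it "1,23,12345678901" instead ends in the "failed predicate" exception, because the eleventh digit has already been matched by the time the check runs.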

2. Gated Semantic Predicates

A gated semantic predicate is similar to a validating semantic predicate, only the gated version produces a syntax error instead of a FailedPredicateException.

The syntax of a gated semantic predicate is:

{ /* a boolean expression in here */ }?=> RULE

To solve the same problem with a gated predicate instead, matching numbers up to 10 digits long, you would write:

number
@init { int N = 1; }
  :  ( { N <= 10 }?=> Digit { N++; } )+
  ;

Test it again with both:

// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890"); 

and:

// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");

and you will see that the last one produces an error.
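
The difference from the validating version can be sketched by hand as follows. Again, this is not generated ANTLR code: `GatedPredicateSketch` is a hypothetical class, it assumes whitespace-free input, and it reports problems with a plain `IllegalStateException`. Here the gate is part of the decision to enter the loop, so the eleventh digit is never consumed by number; it is left in the input, where the parse rule expects a comma or the end of the input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated parser (not ANTLR output).
// Assumes the input contains no whitespace.
class GatedPredicateSketch {
    private final String input;
    private int pos = 0;

    GatedPredicateSketch(String input) {
        this.input = input;
    }

    // parse : number (',' number)* EOF
    List<String> parse() {
        List<String> numbers = new ArrayList<>();
        numbers.add(number());
        while (pos < input.length() && input.charAt(pos) == ',') {
            pos++; // consume ','
            numbers.add(number());
        }
        if (pos != input.length()) {
            // A leftover digit ends up here: neither ',' nor EOF.
            throw new IllegalStateException("syntax error at index " + pos);
        }
        return numbers;
    }

    // number : ( { N <= 10 }?=> Digit { N++; } )+
    private String number() {
        int n = 1; // plays the role of @init { int N = 1; }
        int start = pos;
        // The gate is evaluated before each Digit: once it is false,
        // matching another Digit is no longer a viable choice.
        while (n <= 10 && pos < input.length() && isDigit(input.charAt(pos))) {
            pos++; // consume the Digit
            n++;   // the embedded action { N++; }
        }
        if (pos == start) {
            throw new IllegalStateException("syntax error at index " + pos);
        }
        return input.substring(start, pos);
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(new GatedPredicateSketch("1,23,1234567890").parse());
    }
}
```

With "1,23,12345678901" the sketch stops with a syntax error rather than a failed predicate, which mirrors the behaviour described above.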


3. Disambiguating Semantic Predicates

The final type of predicate is a disambiguating semantic predicate, which looks a bit like a validating predicate ({boolean-expression}?), but acts more like a gated semantic predicate (no exception is thrown when the boolean expression evaluates to false). You can use it at the start of a rule to check some property of a rule and let the parser match said rule or not.

Let's say the example grammar creates Number tokens (a lexer rule instead of a parser rule) that will match numbers in the range of 0..999. Now in the parser, you'd like to make a distinction between low and high numbers (low: 0..500, high: 501..999). This could be done using a disambiguating semantic predicate where you inspect the token next in the stream (input.LT(1)) to check if it's either low or high.

A demo:

grammar Numbers;

parse
  :  atom (',' atom)* EOF
  ;

atom
  :  low  {System.out.println("low  = " + $low.text);}
  |  high {System.out.println("high = " + $high.text);}
  ;

low
  :  {Integer.valueOf(input.LT(1).getText()) <= 500}? Number
  ;

high
  :  Number
  ;

Number
  :  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

fragment Digit
  :  '0'..'9'
  ;

WhiteSpace
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

If you now parse the string "123, 999, 456, 700, 89, 0", you'd see the following output:

low  = 123
high = 999
low  = 456
high = 700
low  = 89
low  = 0
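
The decision the parser makes for each atom can be mimicked in plain Java. This is only an illustrative sketch: the class `DisambiguatingPredicateSketch` and its `classify` method are invented here, tokens are obtained by splitting on commas rather than by a real lexer, and the 1-to-3-digit limit of the Number rule is not enforced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the atom rule (not ANTLR output).
class DisambiguatingPredicateSketch {
    // atom : low | high, chosen by inspecting the upcoming token's text.
    static List<String> classify(String input) {
        List<String> lines = new ArrayList<>();
        for (String raw : input.split(",")) {
            String token = raw.trim(); // whitespace is skipped by the lexer
            // Mirrors {Integer.valueOf(input.LT(1).getText()) <= 500}?
            if (Integer.parseInt(token) <= 500) {
                lines.add("low  = " + token);
            } else {
                // Predicate was false, so the other alternative is taken.
                lines.add("high = " + token);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : classify("123, 999, 456, 700, 89, 0")) {
            System.out.println(line);
        }
    }
}
```

Running main prints the same six lines as the ANTLR demo, since no exception is involved: a false predicate just steers the choice to the other alternative.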

I've always used the terse reference to ANTLR predicates on wincent.com as my guide.

Reference URL: https://stackoverflow.com/questions/3056441/what-is-a-semantic-predicate-in-antlr
