<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>fleurer &#187; parsec</title>
	<atom:link href="http://www.fleurer-lee.com/tag/parsec/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.fleurer-lee.com</link>
	<description>rage and love, story of my life.</description>
	<lastBuildDate>Thu, 02 Sep 2010 11:07:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Trying out parsec</title>
		<link>http://www.fleurer-lee.com/2009/04/26/%e8%af%95%e7%8e%a9%e4%ba%86%e4%b8%8bparsec/</link>
		<comments>http://www.fleurer-lee.com/2009/04/26/%e8%af%95%e7%8e%a9%e4%ba%86%e4%b8%8bparsec/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 05:22:31 +0000</pubDate>
		<dc:creator>ssword</dc:creator>
				<category><![CDATA[Notes]]></category>
		<category><![CDATA[FP]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[parsec]]></category>

		<guid isPermaLink="false">http://swdpress.cn/?p=645132</guid>
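		<description><![CDATA[The article I translated last week mentioned parsec in passing, so I tried it out by following the documentation. I had played with racc, the Ruby parser generator, before; it is not well known and its documentation is incomplete, so it was not much fun, and I never touched it again after writing a hello world. Time to write a hello world for parsec.
The hello world of parsing is expression evaluation. parsec does not look much like the *acc tools, but in the end it is all context-free grammars. Here is the well-known EBNF: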

expr     ::=   expr  ’+’   term  &#124;  term
term     ::=   term  ’*’   factor   &#124;  factor
factor   ::=   ’&#40;’  expr   ’&#41;’  &#124;  digit+
&#160;
digit    ::= [...]]]></description>
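			<content:encoded><![CDATA[<p>The article I translated last week mentioned parsec in passing, so I tried it out by following the documentation. I had played with racc, the Ruby parser generator, before; it is not well known and its documentation is incomplete, so it was not much fun, and I never touched it again after writing a hello world. Time to write a hello world for parsec.</p>
<p>The hello world of parsing is expression evaluation. parsec does not look much like the *acc tools, but in the end it is all context-free grammars. Here is the well-known EBNF:</p>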

<div class="wp_syntax"><div class="code"><pre class="pascal" style="font-family:monospace;">expr     <span style="color: #339933;">::=</span>   expr  '<span style="color: #339933;">+</span>'   term  |  term
term     <span style="color: #339933;">::=</span>   term  '<span style="color: #339933;">*</span>'   factor   |  factor
factor   <span style="color: #339933;">::=</span>   '<span style="color: #009900;">&#40;</span>'  expr   '<span style="color: #009900;">&#41;</span>'  |  digit<span style="color: #339933;">+</span>
&nbsp;
digit    <span style="color: #339933;">::=</span>   '<span style="color: #cc66cc;">0</span>'  |  '<span style="color: #cc66cc;">1</span>'  |  ...  |  '<span style="color: #cc66cc;">9</span>'</pre></div></div>

<p>Copying the grammar rule for rule gives this parsec version:</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">qualified</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec<span style="color: #339933; font-weight: bold;">.</span>Token <span style="color: #06c; font-weight: bold;">as</span> P
<span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec<span style="color: #339933; font-weight: bold;">.</span>Language <span style="color: green;">&#40;</span>haskellDef<span style="color: green;">&#41;</span>
&nbsp;
do<span style="color: #339933; font-weight: bold;">_</span>parse p input
	<span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>parse p <span style="background-color: #3cb371;">&quot;&quot;</span> input<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
		Left err <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
			<span style="font-weight: bold;">putStr</span> <span style="background-color: #3cb371;">&quot;parse error at:&quot;</span>;
			<span style="font-weight: bold;">print</span> err;
		<span style="color: green;">&#125;</span>
		Right x <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
			<span style="font-weight: bold;">print</span> x;
		<span style="color: green;">&#125;</span>
&nbsp;
expr <span style="color: #339933; font-weight: bold;">::</span> Parser <span style="color: #cccc00; font-weight: bold;">Int</span>
expr <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	a <span style="color: #339933; font-weight: bold;">&lt;-</span> expr;
	char '<span style="color: #339933; font-weight: bold;">+</span>';
	b <span style="color: #339933; font-weight: bold;">&lt;-</span> term;
	<span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span>a <span style="color: #339933; font-weight: bold;">+</span> b<span style="color: green;">&#41;</span>;
<span style="color: green;">&#125;</span> <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> term
&nbsp;
term <span style="color: #339933; font-weight: bold;">::</span> Parser <span style="color: #cccc00; font-weight: bold;">Int</span>
term <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	a <span style="color: #339933; font-weight: bold;">&lt;-</span> term;
	char '<span style="color: #339933; font-weight: bold;">*</span>';
	b <span style="color: #339933; font-weight: bold;">&lt;-</span> factor;
	<span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span>a<span style="color: #339933; font-weight: bold;">*</span>b<span style="color: green;">&#41;</span>;
<span style="color: green;">&#125;</span> <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> factor
&nbsp;
factor <span style="color: #339933; font-weight: bold;">::</span> Parser <span style="color: #cccc00; font-weight: bold;">Int</span>
factor <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	char '<span style="color: green;">&#40;</span>';
	a <span style="color: #339933; font-weight: bold;">&lt;-</span> expr;
	char '<span style="color: green;">&#41;</span>';
	<span style="font-weight: bold;">return</span> a;
<span style="color: green;">&#125;</span> <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> number
&nbsp;
number <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	a <span style="color: #339933; font-weight: bold;">&lt;-</span> many1 digit;
	<span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">read</span> a<span style="color: green;">&#41;</span>;
<span style="color: green;">&#125;</span></pre></div></div>

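<p>parsec stands for parser combinators, so of course it differs from the *acc approach. Every function here is a combinator that can be assembled with others like machine parts.</p>
<p>The do_parse function takes two arguments: a parser and the string holding the expression. Besides the Python-like indentation style, Haskell's do-notation also allows the C-style braces and semicolons used here. parsec uses the monad to express sequencing and &lt;|&gt; to express choice; in a sense, those two seem to be enough to build complex parsers. At first glance this looks much clumsier than the pseudo-BNF of the *acc tools, but don't forget the advantage of combinators: they compose. A few simple combinators combine into a complex one, complex ones combine into still more complex ones, and the result stays very concise to use.</p>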
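<p>To make the "sequence plus choice" idea concrete, here is a minimal sketch. It is not parsec: it assumes ReadP from base (Text.ParserCombinators.ReadP) as a stand-in so that it runs without extra packages, with &lt;++ playing the role of a left-biased choice. The names parenthesized, atom and runAtom are made up for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
&nbsp;
-- Sequence: read one or more digits, then convert them.
number :: ReadP Int
number = do
  ds &lt;- munch1 isDigit
  return (read ds)
&nbsp;
-- Sequence again: '(' then a number then ')'.
parenthesized :: ReadP Int
parenthesized = do
  _ &lt;- char '('
  n &lt;- number
  _ &lt;- char ')'
  return n
&nbsp;
-- Choice: try the parenthesized form first, otherwise a bare number.
atom :: ReadP Int
atom = parenthesized &lt;++ number
&nbsp;
-- Hypothetical helper: succeed only if the whole input is consumed.
runAtom :: String -&gt; Maybe Int
runAtom s = case readP_to_S (atom &lt;* eof) s of
  [(n, "")] -&gt; Just n
  _         -&gt; Nothing</pre></div></div>

<p>With these definitions, runAtom "(42)" and runAtom "42" should both give Just 42, while runAtom "(42" gives Nothing.</p>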
<p>Well, enough rambling.<br />
Let's test it in ghci:</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #339933; font-weight: bold;">*</span>Main<span style="color: #339933; font-weight: bold;">&gt;</span> do<span style="color: #339933; font-weight: bold;">_</span>parse expr <span style="background-color: #3cb371;">&quot;1+1&quot;</span>
<span style="color: #339933; font-weight: bold;">***</span> Exception: stack overflow</pre></div></div>

<p>Checking the manual, I found this sentence: “Unfortunately, left-recursive grammars can not be specified directly in a combinator library. If you accidently write a left recursive program, the parser will go into an infinite loop!”</p>
<p>Speechless.<br />
A combinator is still just a function; imagine what would happen if you defined a function like this in C:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">Parser expr<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
	a <span style="color: #339933;">=</span> expr<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	...
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>That said: “However, every left-recursive grammar can be rewritten to a non-left-recursive grammar. The library provides combinators which do this automatically for you (chainl and chainl1).”</p>
<p>Right. The manual includes an expression-evaluation example that simply uses Parsec's built-in ParsecExpr module:</p>
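<p>The rewrite the manual alludes to can be sketched by hand. The left-recursive rule expr ::= expr '-' number | number accepts the same strings as number ('-' number)*, and folding to the left while looping keeps the left associativity. This is a minimal sketch that assumes ReadP from base rather than parsec; chainl1' is my own name, and it only approximates what the library's chainl1 does.</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S, (&lt;++))
&nbsp;
-- Parse p (op p)*, applying each operator to the accumulated
-- left-hand value as soon as its right operand has been read.
chainl1' :: ReadP a -&gt; ReadP (a -&gt; a -&gt; a) -&gt; ReadP a
chainl1' p op = p &gt;&gt;= rest
  where
    -- Try one more "op p" step; if that fails, stop with what we have.
    rest acc = step acc &lt;++ return acc
    step acc = do
      f &lt;- op
      x &lt;- p
      rest (f acc x)
&nbsp;
number :: ReadP Int
number = read &lt;$&gt; munch1 isDigit
&nbsp;
-- Subtraction only, to make the associativity visible.
expr :: ReadP Int
expr = number `chainl1'` subOp
  where
    subOp = (-) &lt;$ char '-'
&nbsp;
-- Hypothetical helper: succeed only if the whole input is consumed.
evalExpr :: String -&gt; Maybe Int
evalExpr s = case readP_to_S (expr &lt;* eof) s of
  [(n, "")] -&gt; Just n
  _         -&gt; Nothing</pre></div></div>

<p>Here evalExpr "10-3-2" should give Just 5, that is (10-3)-2 rather than 10-(3-2), and no parser ever calls itself before consuming input.</p>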

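<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec
<span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec<span style="color: #339933; font-weight: bold;">.</span>Expr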
&nbsp;
expr      <span style="color: #339933; font-weight: bold;">::</span>  Parser   <span style="color: #cccc00; font-weight: bold;">Integer</span>
expr      <span style="color: #339933; font-weight: bold;">=</span>  buildExpressionParser      table   factor
          <span style="color: #339933; font-weight: bold;">&lt;?&gt;</span>  <span style="background-color: #3cb371;">&quot;expression&quot;</span>
&nbsp;
table     <span style="color: #339933; font-weight: bold;">=</span>  <span style="color: green;">&#91;</span><span style="color: green;">&#91;</span>op  <span style="background-color: #3cb371;">&quot;*&quot;</span>  <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">*</span><span style="color: green;">&#41;</span>  AssocLeft <span style="color: #339933; font-weight: bold;">,</span>   op  <span style="background-color: #3cb371;">&quot;/&quot;</span>  <span style="font-weight: bold;">div</span>  AssocLeft<span style="color: green;">&#93;</span>
             <span style="color: #339933; font-weight: bold;">,</span><span style="color: green;">&#91;</span>op  <span style="background-color: #3cb371;">&quot;+&quot;</span>  <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span><span style="color: green;">&#41;</span>  AssocLeft<span style="color: #339933; font-weight: bold;">,</span>    op  <span style="background-color: #3cb371;">&quot;-&quot;</span>  <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">-</span><span style="color: green;">&#41;</span>  AssocLeft<span style="color: green;">&#93;</span>
             <span style="color: green;">&#93;</span>
          <span style="color: #06c; font-weight: bold;">where</span>
             op  s f  assoc
                <span style="color: #339933; font-weight: bold;">=</span>  Infix   <span style="color: green;">&#40;</span><span style="color: #06c; font-weight: bold;">do</span><span style="color: green;">&#123;</span>  string   s;  <span style="font-weight: bold;">return</span>  f<span style="color: green;">&#125;</span><span style="color: green;">&#41;</span>   assoc
&nbsp;
factor    <span style="color: #339933; font-weight: bold;">=</span>  <span style="color: #06c; font-weight: bold;">do</span><span style="color: green;">&#123;</span>  char  '<span style="color: green;">&#40;</span>'
                ; x  <span style="color: #339933; font-weight: bold;">&lt;-</span> expr
                ; char  '<span style="color: green;">&#41;</span>'
                ; <span style="font-weight: bold;">return</span>   x
               <span style="color: green;">&#125;</span>
          <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span>  number
          <span style="color: #339933; font-weight: bold;">&lt;?&gt;</span>  <span style="background-color: #3cb371;">&quot;simple   expression&quot;</span>
&nbsp;
number    <span style="color: #339933; font-weight: bold;">::</span>  Parser   <span style="color: #cccc00; font-weight: bold;">Integer</span>
number    <span style="color: #339933; font-weight: bold;">=</span>  <span style="color: #06c; font-weight: bold;">do</span><span style="color: green;">&#123;</span>  ds <span style="color: #339933; font-weight: bold;">&lt;-</span>  many1   digit
                ; <span style="font-weight: bold;">return</span>   <span style="color: green;">&#40;</span><span style="font-weight: bold;">read</span>   ds<span style="color: green;">&#41;</span>
               <span style="color: green;">&#125;</span>
          <span style="color: #339933; font-weight: bold;">&lt;?&gt;</span>  <span style="background-color: #3cb371;">&quot;number&quot;</span></pre></div></div>

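<p>Well, operator precedence and left associativity are both taken care of. But that is no fun at all: everything I was supposed to write is already built in. The quote above did mention chainl1, though, so here is a rewrite using chainl1:</p>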

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">module</span> Main <span style="color: #06c; font-weight: bold;">where</span>
&nbsp;
<span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">qualified</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec<span style="color: #339933; font-weight: bold;">.</span>Token <span style="color: #06c; font-weight: bold;">as</span> P
<span style="color: #06c; font-weight: bold;">import</span> Text<span style="color: #339933; font-weight: bold;">.</span>ParserCombinators<span style="color: #339933; font-weight: bold;">.</span>Parsec<span style="color: #339933; font-weight: bold;">.</span>Language <span style="color: green;">&#40;</span>haskellDef<span style="color: green;">&#41;</span>
&nbsp;
main <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	input <span style="color: #339933; font-weight: bold;">&lt;-</span> <span style="font-weight: bold;">getLine</span>;
	do<span style="color: #339933; font-weight: bold;">_</span>parse expr input;
<span style="color: green;">&#125;</span>
&nbsp;
do<span style="color: #339933; font-weight: bold;">_</span>parse p input
	<span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>parse p <span style="background-color: #3cb371;">&quot;&quot;</span> input<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
		Left err <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
			<span style="font-weight: bold;">putStr</span> <span style="background-color: #3cb371;">&quot;parse error at:&quot;</span>;
			<span style="font-weight: bold;">print</span> err;
		<span style="color: green;">&#125;</span>
		Right x <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
			<span style="font-weight: bold;">print</span> x;
		<span style="color: green;">&#125;</span>
&nbsp;
lexer <span style="color: #339933; font-weight: bold;">=</span> P<span style="color: #339933; font-weight: bold;">.</span>makeTokenParser haskellDef
parens <span style="color: #339933; font-weight: bold;">=</span> P<span style="color: #339933; font-weight: bold;">.</span>parens lexer
symbol <span style="color: #339933; font-weight: bold;">=</span> P<span style="color: #339933; font-weight: bold;">.</span>symbol lexer
naturalOrFloat <span style="color: #339933; font-weight: bold;">=</span> P<span style="color: #339933; font-weight: bold;">.</span>naturalOrFloat lexer
&nbsp;
expr <span style="color: #339933; font-weight: bold;">=</span> term `chainl1` addop
term <span style="color: #339933; font-weight: bold;">=</span> factor `chainl1` mulop
factor <span style="color: #339933; font-weight: bold;">=</span> parens expr <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> number
number <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span>
	norf <span style="color: #339933; font-weight: bold;">&lt;-</span> naturalOrFloat;
   	<span style="color: #06c; font-weight: bold;">case</span> norf <span style="color: #06c; font-weight: bold;">of</span>
		Left n <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">fromIntegral</span> n<span style="color: green;">&#41;</span>;  <span style="color: #5d478b; font-style: italic;">-- widen the Integer to Double</span>
	   	Right f	<span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="font-weight: bold;">return</span> f;
<span style="color: green;">&#125;</span>
&nbsp;
mulop <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span> symbol <span style="background-color: #3cb371;">&quot;*&quot;</span> ; <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">*</span><span style="color: green;">&#41;</span>; <span style="color: green;">&#125;</span> <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span> symbol <span style="background-color: #3cb371;">&quot;/&quot;</span> ; <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">/</span><span style="color: green;">&#41;</span>; <span style="color: green;">&#125;</span>
addop <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span> symbol <span style="background-color: #3cb371;">&quot;+&quot;</span> ; <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span><span style="color: green;">&#41;</span>; <span style="color: green;">&#125;</span> <span style="color: #339933; font-weight: bold;">&lt;|&gt;</span> <span style="color: #06c; font-weight: bold;">do</span> <span style="color: green;">&#123;</span> symbol <span style="background-color: #3cb371;">&quot;-&quot;</span> ; <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">-</span><span style="color: green;">&#41;</span>; <span style="color: green;">&#125;</span></pre></div></div>

<p>This uses ParsecToken, which directly adopts Haskell's token definitions; in other words, you can use Haskell comments in the input to this program:</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #339933; font-weight: bold;">*</span>Main<span style="color: #339933; font-weight: bold;">&gt;</span> do<span style="color: #339933; font-weight: bold;">_</span>parse expr <span style="background-color: #3cb371;">&quot;1+{-- it's a comment --}(1/2)&quot;</span>
<span style="color: red;">1.5</span></pre></div></div>
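<p>The comment disappears because token parsers are lexeme parsers: each one skips the trailing whitespace after the token, and for a language definition such as haskellDef "whitespace" includes comments. A rough sketch of that mechanism follows. It assumes ReadP from base instead of ParsecToken, handles only non-nested {- -} comments, and the names comment, whiteSpace, lexeme, sumExpr and evalSum are my own.</p>

<div class="wp_syntax"><div class="code"><pre class="haskell" style="font-family:monospace;">import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
&nbsp;
-- Skip one {- ... -} comment (nesting is not handled in this sketch).
comment :: ReadP ()
comment = do
  _ &lt;- string "{-"
  _ &lt;- manyTill get (string "-}")
  return ()
&nbsp;
-- Skip any run of spaces and comments.
whiteSpace :: ReadP ()
whiteSpace = do
  skipSpaces
  (comment &gt;&gt; whiteSpace) &lt;++ return ()
&nbsp;
-- A lexeme parser runs p, then discards trailing spaces and comments.
lexeme :: ReadP a -&gt; ReadP a
lexeme p = p &lt;* whiteSpace
&nbsp;
natural :: ReadP Int
natural = lexeme (read &lt;$&gt; munch1 isDigit)
&nbsp;
symbol :: String -&gt; ReadP String
symbol = lexeme . string
&nbsp;
-- A sum of naturals separated by "+".
sumExpr :: ReadP Int
sumExpr = sum &lt;$&gt; sepBy1 natural (symbol "+")
&nbsp;
-- Hypothetical helper: succeed only if the whole input is consumed.
evalSum :: String -&gt; Maybe Int
evalSum s = case readP_to_S (whiteSpace *&gt; sumExpr &lt;* eof) s of
  [(n, "")] -&gt; Just n
  _         -&gt; Nothing</pre></div></div>

<p>With this, evalSum "1+{-- it's a comment --}2" should give Just 3: the comment is consumed as part of the whitespace after the "+" symbol.</p>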

]]></content:encoded>
			<wfw:commentRss>http://www.fleurer-lee.com/2009/04/26/%e8%af%95%e7%8e%a9%e4%ba%86%e4%b8%8bparsec/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
