More explanation on the README

Getty Ritter, 9 years ago
Parent
Commit
04fa58a2ac
1 file changed, 119 insertions and 1 deletion
+ 119 - 1
README.md

@@ -25,6 +25,12 @@ type parameters:
 someSpec :: SExprSpec atom carrier
 ~~~~
 
+Various functions will be provided that modify the carrier type (i.e. the
+output type of parsing or the input type of serialization) or the language
+recognized by the parser. Examples will be shown below.
+
+## Representing S-expressions
+
 There are three built-in representations of S-expression lists: two of them
 are isomorphic, as one or the other might be better for processing
 S-expression data, and the third represents only a subset of possible
@@ -32,9 +38,121 @@ S-expressions.
 
 ~~~~
 -- cons-based representation
-data SExpr atom = SAtom atom | SCons (SExpr atom) (SExpr atom) | SNil
+data SExpr atom
+  = SCons (SExpr atom) (SExpr atom)
+  | SNil
+  | SAtom atom
 
 -- list-based representation
 data RichSExpr atom
   = RSList [RichSExpr atom]
+  | RSDotted [RichSExpr atom] atom
+  | RSAtom atom
+
+-- well-formed representation
+data WellFormedSExpr atom
+  = WFSList [WellFormedSExpr atom]
+  | WFSAtom atom
+~~~~
+
+In the above, an `RSList [a, b, c]` and a
+`WFSList [a, b, c]` both correspond to the structure
+`SCons a (SCons b (SCons c SNil))`, which corresponds to an
+S-expression which can be written as
+`(a b c)` or as `(a . (b . (c . ())))`. An `RSDotted`
+corresponds to a sequence of conses that does not terminate
+in an empty list, e.g. `RSDotted [a, b] c` corresponds to
+`SCons a (SCons b (SAtom c))`, which in turn corresponds to
+a structure like `(a b . c)` or `(a . (b . c))`.
+
+Functions for converting back and forth between
+representations are provided, but you can also modify a
+`SExprSpec` to parse to or serialize from a particular
+representation using the `asRich` and `asWellFormed`
+functions.
+
+~~~~
+*Data.SCargot.General> decode spec "(a b c)"
+Right [SCons (SAtom "a") (SCons (SAtom "b") (SCons (SAtom "c") SNil))]
+*Data.SCargot.General> decode (asRich spec) "(a b c)"
+Right [RSList [RSAtom "a",RSAtom "b",RSAtom "c"]]
+*Data.SCargot.General> decode (asWellFormed spec) "(a b c)"
+Right [WFSList [WFSAtom "a",WFSAtom "b",WFSAtom "c"]]
+*Data.SCargot.General> decode spec "(a . b)"
+Right [SCons (SAtom "a") (SAtom "b")]
+*Data.SCargot.General> decode (asRich spec) "(a . b)"
+Right [RSDotted [RSAtom "a"] "b"]
+*Data.SCargot.General> decode (asWellFormed spec) "(a . b)"
+Left "Found atom in cdr position"
+~~~~
+
+## Comments
+
+By default, an S-expression spec does not include a comment syntax, but
+the provided `withSemicolonComments` function will cause it to understand
+traditional Lisp line-oriented comments that begin with a semicolon:
+
+~~~~
+*Data.SCargot.General> decode spec "(this ; has a comment\n inside)\n"
+Left "Failed reading: takeWhile1"
+*Data.SCargot.General> decode (withSemicolonComments spec) "(this ; has a comment\n inside)\n"
+Right [SCons (SAtom "this") (SCons (SAtom "inside") SNil)]
+~~~~
+
+Additionally, you can provide your own comment syntax in the form of an
+Attoparsec parser. Any Attoparsec parser can be used, so long as it meets
+the following criteria:
+- it is capable of failing (as it is called repeatedly until S-Cargot
+believes that there are no more comments)
+- it does not consume any input in the case of failure, which may involve
+wrapping the parser in a call to `try`
+
+For example, the following adds C++-style comments to an S-expression format:
+
+~~~~
+*Data.SCargot.General> let cppComment = string "//" >> takeWhile (/= '\n') >> return ()
+*Data.SCargot.General> decode (setComment cppComment spec) "(a //comment\n  b)\n"
+Right [SCons (SAtom "a") (SCons (SAtom "b") SNil)]
+~~~~
+
+## Reader Macros
+
+A _reader macro_ is a Lisp macro which is invoked during read time. This
+allows the _lexical_ syntax of a Lisp to be modified. The most commonly
+seen reader macro is the quote, which allows the syntax `'expr` to stand
+in for the S-expression `(quote expr)`. The S-Cargot library enables this
+by keeping a map of characters to Attoparsec parsers that can be used as
+readers. There is a special case for the aforementioned quote, but that
+could easily be written by hand as
+
+~~~~
+*Data.SCargot.General> let mySpec = addReader '\'' (fmap go) spec
+                         where go c = SCons (SAtom "quote") (SCons c SNil)
+*Data.SCargot.General> decode (asRich mySpec) "(1 2 '(3 4))"
+Right [RSList [RSAtom "1",RSAtom "2",RSList [RSAtom "quote",RSList [RSAtom "3",RSAtom "4"]]]]
+~~~~
+
+A reader macro is passed the parser that invoked it, so that it can
+perform recursive calls, and can return any `SExpr` it would like. It
+may also take as much or as little of the remaining parse stream as it
+would like; for example, the following reader macro does not bother
+parsing anything else and merely returns a new token:
+
+~~~~
+*Data.SCargot.General> decode (addReader '?' (const (pure (SAtom "huh"))) mySpec) "(?1 2)"
+Right [SCons (SAtom "huh") (SCons (SAtom "1") (SCons (SAtom "2") SNil))]
+~~~~
+
+Reader macros in S-Cargot can be used to define common bits of Lisp
+syntax that are not typically considered the purview of S-expression
+parsers. For example, to allow square brackets as a substitute for
+proper lists, we could define a reader macro that is triggered by the
+`[` character and repeatedly calls the parser until a `]` character
+is reached:
+
+~~~~
+*Data.SCargot.General> let pVec p = (char ']' *> pure SNil) <|> (SCons <$> p <*> pVec p)
+*Data.SCargot.General> let vec = addReader '[' pVec
+*Data.SCargot.General> decode (asRich (vec mySpec)) "(1 [2 3])"
+Right [RSList [RSAtom "1",RSList [RSAtom "2",RSAtom "3"]]]
 ~~~~
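
The correspondence between the three representations described in this change can be sketched in plain Haskell. This is only an illustration under stated assumptions and is not taken from the library: the types are local copies of the declarations in the README, the dotted-list constructor is spelled `RSDotted` to match the REPL output, and the helpers `toRich` and `toWellFormed` are hypothetical names invented for this sketch.

```haskell
module Main where

-- Local copies of the three representations from the README, defined here
-- for illustration rather than imported from the library.
data SExpr atom
  = SCons (SExpr atom) (SExpr atom)
  | SNil
  | SAtom atom
  deriving (Eq, Show)

data RichSExpr atom
  = RSList [RichSExpr atom]
  | RSDotted [RichSExpr atom] atom
  | RSAtom atom
  deriving (Eq, Show)

data WellFormedSExpr atom
  = WFSList [WellFormedSExpr atom]
  | WFSAtom atom
  deriving (Eq, Show)

-- Hypothetical helper: collapse a chain of conses into the list-based form.
-- A chain ending in SNil becomes RSList; one ending in an atom becomes RSDotted.
toRich :: SExpr atom -> RichSExpr atom
toRich (SAtom a)    = RSAtom a
toRich SNil         = RSList []
toRich (SCons x xs) = go [toRich x] xs
  where
    go acc (SCons y ys) = go (toRich y : acc) ys
    go acc SNil         = RSList (reverse acc)
    go acc (SAtom a)    = RSDotted (reverse acc) a

-- Hypothetical helper: convert to the well-formed subset, rejecting any
-- chain of conses that does not end in the empty list.
toWellFormed :: SExpr atom -> Either String (WellFormedSExpr atom)
toWellFormed (SAtom a) = Right (WFSAtom a)
toWellFormed expr      = WFSList <$> elems expr
  where
    elems SNil         = Right []
    elems (SCons x xs) = (:) <$> toWellFormed x <*> elems xs
    elems (SAtom _)    = Left "Found atom in cdr position"

main :: IO ()
main = do
  let proper = SCons (SAtom "a") (SCons (SAtom "b") (SCons (SAtom "c") SNil))
      dotted = SCons (SAtom "a") (SAtom "b")
  print (toRich proper)
  print (toRich dotted)
  print (toWellFormed proper)
  print (toWellFormed dotted)
```

Here `proper` models `(a b c)` and `dotted` models `(a . b)`. The rich conversion accepts both, while the well-formed conversion returns a `Left` for the dotted pair, mirroring the `asWellFormed` failure shown in the REPL session above.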