Island Grammar-Based Parsing Using GLL and Tom

  • Conference paper
Software Language Engineering (SLE 2012)

Abstract

Extending a language by embedding another language within it presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high-level languages. Tom provides syntax and semantics for advanced pattern matching and tree rewriting facilities. Embedded Tom constructs are translated into the host language by a preprocessor, the output of which is a composite program written purely in the host language. Tom implementations exist for Java, C, C#, Python and Caml. The current parser is complex and difficult to maintain. We describe how Tom can be parsed using island grammars implemented with the Generalised LL (GLL) parsing algorithm. The grammar is, as might be expected, ambiguous. Extracting the correct derivation relies on our disambiguation strategy, which is based on pattern matching within the parse forest. We describe different classes of ambiguity and propose patterns for resolving them.
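
To make the idea of disambiguating by pattern matching on a parse forest more concrete, here is a minimal sketch in Python. It is not the paper's implementation and does not use GLL or Tom: the node types (`Leaf`, `Seq`, `Amb`), the `disambiguate` function, and the single "prefer island over water" rule are all hypothetical, chosen only to show what a rewrite over an ambiguity node could look like.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    label: str  # hypothetical labels: "Water" (host code) or "Island" (embedded code)
    text: str


@dataclass
class Seq:
    children: List["Node"]


@dataclass
class Amb:
    # An ambiguity node: several derivations of the same input span.
    alternatives: List["Node"]


Node = Union[Leaf, Seq, Amb]


def disambiguate(node: Node) -> Node:
    """Resolve ambiguity nodes bottom-up with one illustrative pattern."""
    if isinstance(node, Leaf):
        return node
    if isinstance(node, Seq):
        return Seq([disambiguate(child) for child in node.children])
    # Ambiguity node: first resolve each alternative.
    alts = [disambiguate(alt) for alt in node.alternatives]
    # Assumed pattern: Amb(..., Island, ...) with exactly one island -> that island.
    islands = [a for a in alts if isinstance(a, Leaf) and a.label == "Island"]
    if len(islands) == 1:
        return islands[0]
    # No pattern applies: keep the (possibly smaller) ambiguity for a later rule.
    return alts[0] if len(alts) == 1 else Amb(alts)


# The same span is derivable both as host-language water and as an embedded island.
forest = Seq([
    Leaf("Water", "int x = 0;"),
    Amb([Leaf("Water", "%match(x) { }"), Leaf("Island", "%match(x) { }")]),
])
resolved = disambiguate(forest)
```

In this sketch an ambiguity that no pattern covers is left in place rather than resolved arbitrarily, so further patterns for other ambiguity classes could be added as separate rules.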

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Afroozeh, A. et al. (2013). Island Grammar-Based Parsing Using GLL and Tom. In: Czarnecki, K., Hedin, G. (eds) Software Language Engineering. SLE 2012. Lecture Notes in Computer Science, vol 7745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36089-3_13

  • DOI: https://doi.org/10.1007/978-3-642-36089-3_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36088-6

  • Online ISBN: 978-3-642-36089-3

  • eBook Packages: Computer Science (R0)
