RSS

Abstract Syntax Tree for your own applications

04 Oct

This article shows how you can use the Abstract Syntax Tree for your own applications using ECLIPSE. This article is written by

By Thomas Kuhn, Eye Media GmbH
Olivier Thomann, IBM Ottawa Lab
Copyright ©2006 Thomas Kuhn, Olivier Thomann. Made available under the EPL v1.0
November 20, 2006

Summary

The Abstract Syntax Tree is the base framework for many powerful tools of the Eclipse IDE, including refactoring, Quick Fix and Quick Assist. The Abstract Syntax Tree maps plain Java source code in a tree form. This tree is more convenient and reliable to analyse and modify programmatically than text-based source.

Introduction

Are you wondering how Eclipse is doing all the magic like jumping conveniently to a declaration, when you press “F3” on a reference to a field or method? Or how “Replace in file” solidly detects the declaration and all the references to the local variable and modifies them synchronously?

Well, these—and a big portion of the other source code modification and generation tools—are based upon the Abstract Syntax Tree (AST). The AST is comparable to the DOM tree model of an XML file. Just like with DOM, the AST allows you to modify the tree model and reflects these modifications in the Java source code.

This article refers to an example application which covers most of the interesting AST-related topics. Let us have a look at the application that was built to
illustrate this article:

Example Application

According to Java Practices [4], you should not declare local variables before using them. The goal of our application will be to detect contradicting variable declarations and to move them to their correct place. There are three cases our application has to deal with:

  1. Removal of unnecessary declaration. If a variable is declared and initialized, only to be overridden by another assignment later on, the first declaration of the variable is an unnecessary declaration.
  2. Move of declaration. If a variable is declared, and not immediately referenced within the following statement, this variable declaration has to be moved. The correct place for the declaration is the line before it is first referenced.
  3. Move of declaration of a variable that is referred to from within different blocks. This is a subcase of case 2. Imagine that a variable is used in both a try- and a catch clause. Here the declaration cannot be moved right before the first reference in the try-clause, since then it would not be declared in the catch-clause. Our application has to deal with that and has to move the declaration to the best possible place, which would be here one line above the try-clause.

In Appendix A, Code Fragments for Example Application Cases code snippets to each of these cases are provided.

You can import the example application into your workspace [1] or install the plug-in using the Eclipse Update Manager [2].

Workflow

A typical workflow of an application using AST looks like this:

Figure 1. AST Workflow


  1. Java source: To start off, you provide some source code to parse. This source code can be supplied as a Java file in your project or directly as a char[] that contains Java source
  2. Parse: The source code described at 1 is parsed. All you need for this step is provided by the class org.eclipse.jdt.core.dom.ASTParser. See the section called “Parsing source code”.
  3. The Abstract Syntax Tree is the result of step 2. It is a tree model that entirely represents the source you provided in step 1. If requested, the parser also computes and includes additional symbol resolved information called “bindings“.
  4. Manipulating the AST: If the AST of point 3 needs to be changed, this can be done in two ways:
    1. By directly modifying the AST.
    2. By noting the modifications in a separate protocol. This protocol is handled by an instance of ASTRewrite.

    See more in the section called “How to Apply Changes”.

  5. Writing changes back: If changes have been made, they need to be applied to the source code that was provided by 1.This is described in detail in the section called “Write it down”.
  6. IDocument: Is a wrapper for the source code of step 1 and is needed at point 5

The Abstract Syntax Tree (AST)

As mentioned, the Abstract Syntax Tree is the way that Eclipse looks at your source code: every Java source file is entirely represented as tree of AST nodes. These nodes are all subclasses of ASTNode. Every subclass is specialized for an element of the Java Programming Language. E.g. there are nodes for method declarations ( MethodDeclaration), variable declaration ( VariableDeclarationFragment), assignments and so on. One very frequently used node is SimpleName. A SimpleName is any string of Java source that is not a keyword, a Boolean literal ( true or false) or the null literal.

For example, in i = 6 + j;, i and j are represented by SimpleNames. In import net.sourceforge.earticleast, net sourceforge and
earticleast are mapped to SimpleNames.

All AST-relevant classes are located in the package org.eclipse.jdt.core.dom of the org.eclipse.jdt.core plug-in.

To discover how code is represented as AST, the AST Viewer plug-in [5] is a big help: Once installed you can simply mark source code in the editor and let it be displayed in a tree form in the AST Viewer view.

Parsing source code

Most of the time, an AST is not created from scratch, but rather parsed from existing Java code. This is done using the ASTParser. It processes whole Java files as well as portions of Java code. In the example application the method parse(ICompilationUnit unit) of the class AbstractASTArticle parses the source code stored in the file that unit points to:

protected CompilationUnit parse(ICompilationUnit unit) {
	ASTParser parser = ASTParser.newParser(AST.JLS3);
	parser.setKind(ASTParser.K_COMPILATION_UNIT);
	parser.setSource(unit); // set source
	parser.setResolveBindings(true); // we need bindings later on
	return (CompilationUnit) parser
.createAST(null /* IProgressMonitor */); // parse
}

With ASTParser.newParser(AST.JLS3), we advise the parser to parse the code following to the Java Language Specification, Third Edition. JLS3 includes all Java Language Specifications up to the new syntax introduced in Java 5. With the update of Eclipse towards JLS3, changes have been made to the AST API. To preserve compatibility, the ASTParser can be run in the deprecated JLS2 mode.

parser.setKind(ASTParser.K_COMPILATION_UNIT) tells the parser, that it has to expect an ICompilationUnit as input. An ICompilationUnit is a pointer to a Java file. The parser supports five kinds of input:

Entire source file: The parser expects the source either as a pointer to a Java file (which means as an ICompilationUnit, see the section called “Java Model”) or as
char[].

  • K_COMPILATION_UNIT

Portion of Java code: The parser processes a portion of code. The input format is char[].

  • K_EXPRESSION: the input contains a Java expression. E.g. new String(), 4+6 or i.
  • K_STATEMENTS: the input contains a Java statement like new String(); or synchronized (this) { ... }.
  • K_CLASS_BODY_DECLARATIONS: the input contains elements of a Java class like method declarations, field declarations, static blocks, etc.

Java Model

The Java Model is a whole different story. It is out of scope of this article to dive deep into its details within. The parts looked at will be the ones which intersect with the AST. The motivation to discuss it here is, to use it as an entry point to build an Abstract Syntax Tree of a source file. Remember, the ICompilationUnit is one of the possible parameters for the AST parser.

The Java Model represents a Java Project in a tree structure, which is visualized by the well known “Package Explorer” view:

Figure 2. Java Model Overview


The nodes of the Java Model implement one of the following interfaces:

  • IJavaProject: Is the node of the Java Model and represents a Java Project. It contains IPackageFragmentRoots as child nodes.
  • IPackageFragmentRoot: can be a source or a class folder of a project, a .zip or a .jar file. IPackageFragmentRoot can hold source or binary
    files.
  • IPackageFragment: A single package. It contains  ICompilationUnits or IClassFiles, depending on whether the IPackageFragmentRoot is of type source or of type binary. Note that IPackageFragment are not organized as parent-children. E.g. net.sf.a is not the parent of net.sf.a.b. They are two independent children of the same IPackageFragmentRoot.
  • ICompilationUnit: a Java source file.
  • IImportDeclaration,  IType, IField, IInitializer, IMethod: children of ICompilationUnit. The information provided by these nodes is available from the AST, too.

In contrast to the AST, these nodes are lightweight handles. It costs much less to rebuild a portion of the Java Model than to rebuild an AST. That is also one reason why the Java Model is not only defined down to the level of ICompilationUnit. There are many cases where complete information, like that provided by the AST, is not needed.

If you want to see more info…. click here

ECLIPE CORNER ARTICLE – Abstract Syntax Tree

Advertisements
 
2 Comments

Posted by on October 4, 2008 in GWT/ JSNI / COMPILER

 

Tags: , ,

2 responses to “Abstract Syntax Tree for your own applications

  1. Sagar Shinde

    August 7, 2011 at 3:30 am

    Hi, as per the article creating abstract syntax tree, modifying it works. But to reflect the chages of modified ast, back to original source code file, is not working.

    If you refer “Write it down” section of the article it says to declare
    ITextFileBufferManager bufferManager = FileBuffers.getTextFileBufferManager(); // get the buffer manager

    This declaration throws following exception in my project:

    Exception in thread “main” java.lang.ExceptionInInitializerError
    at ASTModifier.main(ASTModifier.java:205)
    Caused by: java.lang.IllegalStateException: Workspace is closed.
    at org.eclipse.core.resources.ResourcesPlugin.getWorkspace(ResourcesPlugin.java:340)
    at org.eclipse.core.filebuffers.FileBuffers.(FileBuffers.java:52)
    … 1 more

    What is the reason of this error? If you want, I can send you my source code .

    Kindly help me.
    thanks

    regards
    sagar

     

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: