path: root/src/parser.c
Age | Commit message | Author
2023-05-03 | parser: Variable assignment allocates their own node | Carlos Maniero
This commit makes the variable assignment parser allocate memory for its own node. It also moves the node initialization to ast.c, following our standard for node initialization.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03 | parser: Variable declaration allocates their own node | Carlos Maniero
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03 | parser: Split block into small functions | Carlos Maniero
Since it is possible to look at a future token without consuming it, the block parser could be split into small chunks of code. There is a performance drawback, because the parser now makes multiple lookups of the same token. However, IMO this is not a big concern given the small amount of computation required to get a token. It can also be easily addressed by computing all tokens in advance.
Memory leak: During the refactor I found some extra memory leaks caused by scopes that were not released. So, rather than just printing a message, I introduced an assert in scope.c to make sure developers get this feedback ASAP, because our testing framework suppresses messages from stderr when the test passes.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03 | parser: Use lookahead instead of consuming tokens | Carlos Maniero
Previously, during block declaration, the parser consumed the token, which caused some parsers (such as return and variable declaration) to not be self-contained and to depend on the caller to start the parse. In this commit, I've refactored the parser to only look at future tokens using lookahead and to delegate the consumption to child parser functions. This results in a more modular and self-contained parser that improves the overall maintainability and readability of the code.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
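The peek-then-delegate approach described in this commit can be sketched as follows. This is a minimal illustration, not the project's code: `token_stream_t`, `stream_peek`, `stream_next`, the `TOK_*` kinds, and both parser functions are hypothetical names, and the sketch assumes the tokens were already lexed into an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; the real lexer's names may differ. */
typedef enum { TOK_RETURN, TOK_NUMBER, TOK_SEMICOLON, TOK_EOF } token_kind_t;

/* Hypothetical stream over pre-lexed tokens, terminated by TOK_EOF. */
typedef struct {
    const token_kind_t *tokens;
    size_t cursor;
} token_stream_t;

/* Look at the upcoming token without consuming it. */
static token_kind_t stream_peek(const token_stream_t *s) {
    assert(s != NULL);
    return s->tokens[s->cursor];
}

/* Consume and return the upcoming token; never advances past TOK_EOF. */
static token_kind_t stream_next(token_stream_t *s) {
    token_kind_t kind = stream_peek(s);
    if (kind != TOK_EOF) {
        s->cursor++;
    }
    return kind;
}

/* Self-contained child parser: it consumes `return <number> ;` itself.
 * Returns 1 on success and 0 on a syntax error. */
static int parse_return_stmt(token_stream_t *s) {
    if (stream_next(s) != TOK_RETURN) return 0;
    if (stream_next(s) != TOK_NUMBER) return 0;
    return stream_next(s) == TOK_SEMICOLON;
}

/* The block-level parser only peeks to decide which child parser to call. */
static int parse_statement(token_stream_t *s) {
    switch (stream_peek(s)) {
        case TOK_RETURN:
            return parse_return_stmt(s);
        default:
            return 0;
    }
}
```

Because `parse_statement` never consumes a token, `parse_return_stmt` can also be called directly (for example from a test) without any setup by the caller.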
2023-05-03 | parser: Refactor return statement to return an ast_node | Carlos Maniero
During the refactoring process, I identified a memory leak where the return argument was allocated but not freed in case of an error. This commit also introduces the concept of keyword tokens: return is now a keyword, which simplifies the parser.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03 | Parser: Make the parser function return the ast_node | Carlos Maniero
In many situations, the parser is responsible for reserving memory for nodes, particularly during function body parsing. This commit introduces a new standard where parser functions not only allocate memory for ast_nodes, but also return them. In case of a parser error, a NULL pointer is returned. This standard will be extended to other parsers in future commits, ensuring consistency throughout the codebase.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
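A minimal sketch of the "allocate, return, or NULL on error" convention, assuming a simplified node layout. All names here (`node_kind_t`, `node_new_literal`, `node_destroy`, and the `has_semicolon` flag that stands in for real token inspection) are hypothetical. The error path frees the already-allocated argument, which is the kind of leak the return-statement refactor mentions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified node layout for illustration only. */
typedef enum { NODE_LITERAL, NODE_RETURN } node_kind_t;

typedef struct ast_node {
    node_kind_t kind;
    int value;                 /* used by NODE_LITERAL */
    struct ast_node *argument; /* used by NODE_RETURN */
} ast_node_t;

/* Free a node and everything it owns; accepts NULL. */
static void node_destroy(ast_node_t *node) {
    if (node == NULL) return;
    node_destroy(node->argument);
    free(node);
}

static ast_node_t *node_new_literal(int value) {
    ast_node_t *node = calloc(1, sizeof *node);
    if (node == NULL) return NULL;
    node->kind = NODE_LITERAL;
    node->value = value;
    return node;
}

/* Parser function that allocates and returns its own node.
 * `has_semicolon` stands in for "the next token is ';'".
 * Returns NULL on a parser error, after releasing what it allocated. */
static ast_node_t *parse_return(int value, int has_semicolon) {
    ast_node_t *argument = node_new_literal(value);
    if (argument == NULL) return NULL;

    if (!has_semicolon) {
        node_destroy(argument); /* do not leak the argument on error */
        return NULL;
    }

    ast_node_t *node = calloc(1, sizeof *node);
    if (node == NULL) {
        node_destroy(argument);
        return NULL;
    }
    node->kind = NODE_RETURN;
    node->argument = argument;
    return node;
}
```

With this shape the caller only has to check for NULL and, on success, owns the whole returned subtree.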
2023-05-01 | parser: Implement variable assignment | Johnny Richard
This commit introduces variable assignment, making it possible to change a variable's value. Example:
    myvar: i32 = 1;
    myvar = 2;
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Co-authored-by: Carlos Maniero <carlos@maniero.me>
2023-05-01 | parser: Use peek and drop token when parsing expressions | Johnny Richard
2023-04-30 | style: Invert parameters order on parser_parse_type | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30 | gas: Compile variable expression with scope support | Johnny Richard
This patch adds variable compilation and uses a scope (a stack of maps) to look up identifiers. Today we use a vector plus ref_entry structs to implement the scope. The ref_entry lacks memory management; we are still not sure who will own the pointer. We also want to replace the scope with a hashtable_t type as soon as we get one.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
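As a rough illustration of scoped identifier lookup, the sketch below flattens the stack of maps into a single fixed-size array with a marker per open scope. This is a simplification chosen for brevity and not the vector plus ref_entry design from the commit. `scope_t`, `scope_entry_t`, the `scope_*` functions, and the `stack_offset` field are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCOPE_MAX_ENTRIES 64
#define SCOPE_MAX_DEPTH 16

/* Hypothetical entry: an identifier and where its value lives. */
typedef struct {
    const char *name;
    int stack_offset;
} scope_entry_t;

typedef struct {
    scope_entry_t entries[SCOPE_MAX_ENTRIES];
    size_t count;
    size_t frame_start[SCOPE_MAX_DEPTH]; /* first entry index of each open scope */
    size_t depth;
} scope_t;

/* Open a nested scope. */
static void scope_enter(scope_t *s) {
    assert(s->depth < SCOPE_MAX_DEPTH);
    s->frame_start[s->depth++] = s->count;
}

/* Close the innermost scope, dropping every entry declared in it. */
static void scope_leave(scope_t *s) {
    assert(s->depth > 0);
    s->count = s->frame_start[--s->depth];
}

/* Declare an identifier in the innermost scope. */
static void scope_push(scope_t *s, const char *name, int stack_offset) {
    assert(s->count < SCOPE_MAX_ENTRIES);
    s->entries[s->count++] = (scope_entry_t){ name, stack_offset };
}

/* Search from the innermost declaration outwards, so inner names shadow
 * outer ones. Returns NULL when the identifier is not defined. */
static const scope_entry_t *scope_get(const scope_t *s, const char *name) {
    for (size_t i = s->count; i > 0; i--) {
        if (strcmp(s->entries[i - 1].name, name) == 0) {
            return &s->entries[i - 1];
        }
    }
    return NULL;
}
```

The entries only borrow the `name` pointers, so the question of ownership raised in the commit message is sidestepped here rather than answered.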
2023-04-30 | polish: Remove unnecessary token creation when dropping token | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30 | ast: Rename variable and variable_declaration correctly | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30 | parser: Registry identifiers on scope | Johnny Richard
We now parse variables and functions and check whether they are defined in the scope. If they are not, parsing fails with a clear error message.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-30 | style: Add -Wmissing-declarations to CC CFLAGS | Johnny Richard
The refactoring also replaces an if statement with a switch statement.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-26 | ast: Include a Binary Operation kind enum | Carlos Maniero
The AST was using a string view to distinguish the operation kind. An enum was created for this purpose, simplifying code generation.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-26 | lexer: Split operation tokens into their own token | Carlos Maniero
The +, -, *, and / tokens used to be TOKEN_OP, but TOKEN_OP has been removed and a token for each operation has been introduced. Python's token names were followed: https://docs.python.org/3/library/token.html
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-25 | style: Use clang-format as formatter and linter tool | Johnny Richard
We want to keep the code style consistent. This first commit adds a .clang-format file in order to "document" our code style. This patch also adds a *linter* target to the Makefile, which will complain if there is any style issue in the test and src dirs.
I ran the following command to create the .clang-format file:
    $ clang-format -style=mozilla -dump-config > .clang-format
I also adjusted .clang-format, changing the following properties:
    PointerAlignment: Right
    ColumnLimit: 120
Commands executed to fix the current styling:
    $ find . -name *.h | xargs clang-format -i
    $ find . -name *.c | xargs clang-format -i
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25 | parser: Add support for variables and identifiers in function body | Carlos Maniero
This commit adds support for variables and identifiers in the function body of the parser, stored as a vector. However, at this point, identifier resolution is not fully implemented, and we currently accept identifiers without checking if they can be resolved. This is a known limitation that will be addressed in a future commit once hash-tables are added to the parser.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21 | ast: Create an init function for ast_binary_operation_t | Carlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21 | parser: Parse integers arithmetic expression | Johnny Richard
This patch implements the AST creation for arithmetic expressions. NOTE: The implementation works only for integer numbers.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Reviewed-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-20 | parser: Create the literal node type | Carlos Maniero
Since we want to extend our code to support multiple kinds of expressions, it does not make sense for the return statement to always return a number. From now on, the return statement takes an ast_node_t as its argument, meaning that it could be anything. The literal_node_t was also implemented in order to preserve the application's behavior. Following C's calling convention, literal values are stored in %eax, and the return statement uses that value as needed.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | parser: Stop exiting on parser error | Carlos Maniero
Previously, when an error occurred during parsing, the application would exit, making it difficult to test the parser and limiting the compiler's extensibility. This commit improves the parser's error handling by allowing execution to continue after an error, enabling easier testing and increased flexibility. The parser is prepared to handle multiple errors. Although the current implementation always returns a single error, this may be useful once there are multiple functions, where we can show errors by context.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
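One way to report errors without exiting is to collect them in a caller-provided structure and return a failure code. The sketch below assumes a fixed-capacity error list. `parser_error_t`, `parser_errors_t`, `parser_errors_push`, and the toy `parse_number` function are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARSER_ERRORS 8

/* Hypothetical error record: a message plus the position it refers to. */
typedef struct {
    const char *message;
    size_t row;
    size_t col;
} parser_error_t;

/* Hypothetical fixed-capacity list, so several errors can be reported. */
typedef struct {
    parser_error_t errors[MAX_PARSER_ERRORS];
    size_t count;
} parser_errors_t;

/* Record an error instead of calling exit().
 * Errors beyond the capacity are silently dropped in this sketch. */
static void parser_errors_push(parser_errors_t *errs, const char *message,
                               size_t row, size_t col) {
    assert(errs != NULL);
    if (errs->count == MAX_PARSER_ERRORS) return;
    errs->errors[errs->count++] = (parser_error_t){ message, row, col };
}

/* Toy parser: succeeds (returns 1) only if the input starts with a digit.
 * On failure it records an error and returns 0; the process keeps running. */
static int parse_number(const char *source, parser_errors_t *errs) {
    assert(source != NULL);
    if (source[0] >= '0' && source[0] <= '9') return 1;
    parser_errors_push(errs, "expected a number", 1, 1);
    return 0;
}
```

A test can then call the parser with invalid input and inspect `errs.count` and the recorded messages, which is not possible when the parser terminates the process.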
2023-04-20 | ast: Allows recursive nodes | Carlos Maniero
Previously, the abstract syntax tree (AST) used static types, meaning that an ast_function_t would always have an ast_return_stmt_t as its body. However, this assumption is not always true, as we may have void functions that do not have a return statement. Additionally, the ast_return_stmt_t always had a number associated with it, but this too is not always the case.
To make this possible, I needed to make a few changes across the whole project. One of the main changes is that the inheritance hack is gone. That mechanism was replaced by composition, with pointers where required for recursive type references.
It is important to mention that I decided to use a union type to implement the composition. There are two main advantages to this approach:
1. There is only one function to allocate memory for all kinds of nodes.
2. There is no need to cast the data.
In summary, this commit introduces changes to support dynamic typing in the AST by replacing the inheritance hack with composition and using union types to simplify memory allocation and type casting.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
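The composition-with-union idea can be sketched as a tagged union. The type names `ast_node_t`, `ast_function_t`, and `ast_return_stmt_t` appear in the commit message, but their fields, the `ast_literal_t` type, the `AST_*` kinds, and every function below are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ast_node ast_node_t;

typedef enum { AST_LITERAL, AST_RETURN_STMT, AST_FUNCTION } ast_node_kind_t;

/* Assumed payload layouts; pointers allow recursive references. */
typedef struct { int value; } ast_literal_t;
typedef struct { ast_node_t *argument; } ast_return_stmt_t;
typedef struct { const char *name; ast_node_t *body; } ast_function_t;

struct ast_node {
    ast_node_kind_t kind; /* tag: says which union member is valid */
    union {
        ast_literal_t literal;
        ast_return_stmt_t return_stmt;
        ast_function_t function;
    } data;
};

/* Advantage 1: a single allocation function serves every node kind. */
static ast_node_t *ast_node_new(void) {
    return calloc(1, sizeof(ast_node_t));
}

static void ast_node_init_literal(ast_node_t *node, int value) {
    assert(node != NULL);
    node->kind = AST_LITERAL;
    node->data.literal.value = value;
}

static void ast_node_init_return_stmt(ast_node_t *node, ast_node_t *argument) {
    assert(node != NULL);
    node->kind = AST_RETURN_STMT;
    node->data.return_stmt.argument = argument;
}

static void ast_node_init_function(ast_node_t *node, const char *name,
                                   ast_node_t *body) {
    assert(node != NULL);
    node->kind = AST_FUNCTION;
    node->data.function.name = name;
    node->data.function.body = body;
}

/* Advantage 2: members are reached through the union, with no casts. */
static void ast_node_destroy(ast_node_t *node) {
    if (node == NULL) return;
    switch (node->kind) {
        case AST_LITERAL:
            break;
        case AST_RETURN_STMT:
            ast_node_destroy(node->data.return_stmt.argument);
            break;
        case AST_FUNCTION:
            ast_node_destroy(node->data.function.body);
            break;
    }
    free(node);
}
```

Since a function body is just an `ast_node_t *`, a void function can carry any other node kind (or NULL) as its body without changing the struct.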
2023-04-18 | ast: Create AST visitor to traverse the tree | Johnny Richard
In the future we want to be able to traverse the tree to pretty-print it, generate code for other targets such as LLVM, or transpile to C. This solution also implements the gas x86_64 Linux assembly code generation using the visitor interface.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
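A visitor in C is commonly expressed as a struct of function pointers plus a dispatch function. The sketch below shows that general pattern on a two-node AST with a visitor that counts nodes. The node layout, `ast_visitor_t`, `ast_node_accept`, and the counting callbacks are hypothetical and not the project's interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-kind AST, kept tiny for the example. */
typedef enum { AST_LITERAL, AST_RETURN_STMT } ast_node_kind_t;

typedef struct ast_node {
    ast_node_kind_t kind;
    int value;                 /* used by AST_LITERAL */
    struct ast_node *argument; /* used by AST_RETURN_STMT */
} ast_node_t;

/* One callback per node kind; `ctx` carries the visitor's own state. */
typedef struct ast_visitor {
    void *ctx;
    void (*visit_literal)(struct ast_visitor *visitor, ast_node_t *node);
    void (*visit_return_stmt)(struct ast_visitor *visitor, ast_node_t *node);
} ast_visitor_t;

/* Dispatch on the node kind to the matching callback. */
static void ast_node_accept(ast_node_t *node, ast_visitor_t *visitor) {
    assert(node != NULL && visitor != NULL);
    switch (node->kind) {
        case AST_LITERAL:
            visitor->visit_literal(visitor, node);
            break;
        case AST_RETURN_STMT:
            visitor->visit_return_stmt(visitor, node);
            break;
    }
}

/* Example visitor: counts nodes. A code generator or pretty printer
 * would supply different callbacks with the same signatures. */
static void count_literal(ast_visitor_t *visitor, ast_node_t *node) {
    (void)node;
    (*(int *)visitor->ctx)++;
}

static void count_return_stmt(ast_visitor_t *visitor, ast_node_t *node) {
    (*(int *)visitor->ctx)++;
    if (node->argument != NULL) {
        ast_node_accept(node->argument, visitor); /* descend into the child */
    }
}
```

The traversal order lives in the callbacks here, so each backend decides when to descend into children, for example after emitting a prologue.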
2023-04-16 | Start using string_view on lexer and parser | Johnny Richard
This change fixes the memory leak that occurred when tokens were created.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
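A string view is a non-owning pointer-and-length pair into existing memory, so creating a token value needs no heap allocation and nothing has to be freed. The sketch below is a minimal version of that idea. The struct layout and the `string_view_*` functions are assumptions, not the project's actual string_view API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Non-owning view into a buffer that outlives it (e.g. the source text).
 * The viewed bytes are not necessarily NUL-terminated. */
typedef struct {
    const char *str;
    size_t size;
} string_view_t;

static string_view_t string_view_new(const char *str, size_t size) {
    assert(str != NULL);
    return (string_view_t){ str, size };
}

static string_view_t string_view_from_cstr(const char *cstr) {
    return string_view_new(cstr, strlen(cstr));
}

/* Two views are equal when they have the same length and the same bytes. */
static int string_view_eq(string_view_t a, string_view_t b) {
    return a.size == b.size && memcmp(a.str, b.str, a.size) == 0;
}
```

A lexer can then slice the source in place, for example `string_view_new(source, 6)` for the `return` keyword at the start of `"return 42;"`, as long as the source buffer stays alive for as long as the views are used.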
2023-04-15 | parser: Show filepath row and col when parsing fails | Johnny Richard
In order to find out where a parsing error occurred, this patch reports the exact location in the format 'file:row:col'.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
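A sketch of producing a 'file:row:col' string from a byte offset into the source. It assumes 1-based rows and columns and that the position is derived by scanning for newlines. The commit does not say how the lexer tracks positions, so `location_t`, `location_from_offset`, and `location_format` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical 1-based position in the source text. */
typedef struct {
    size_t row;
    size_t col;
} location_t;

/* Compute the row and column of the byte at `offset` by scanning the text
 * before it. Stops early if the source ends before `offset`. */
static location_t location_from_offset(const char *source, size_t offset) {
    assert(source != NULL);
    location_t loc = { 1, 1 };
    for (size_t i = 0; i < offset && source[i] != '\0'; i++) {
        if (source[i] == '\n') {
            loc.row++;
            loc.col = 1;
        } else {
            loc.col++;
        }
    }
    return loc;
}

/* Write "filepath:row:col" into buf. Returns what snprintf returns:
 * the length the full string needs, excluding the terminating NUL. */
static int location_format(char *buf, size_t size, const char *filepath,
                           location_t loc) {
    return snprintf(buf, size, "%s:%zu:%zu", filepath, loc.row, loc.col);
}
```

Many editors and terminals recognize the `file:row:col` shape, which makes it a convenient format for jumping straight to the failing token.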
2023-04-15 | parser: Create parser for function with return statements | Johnny Richard
This is a very limited parser implementation which parses a single function with return type i32 and a body containing a statement that returns a number. The parser doesn't show the 'filepath:row:col' when it fails; a future improvement would be to display it, making it easy to find where the compilation problem is located.
The ast_nodes take ownership of token.value but never free it. This is a really bad design, since ownership is not taken for every token.value, which causes memory leaks. As a future fix we could use a string_view instead, since we never change the original source code. The string_view will also improve performance considerably by avoiding unnecessary heap memory allocation.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>