summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-05pretty-printer: Remove unused fieldCarlos Maniero
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-05cli: Add AST pretty-printing option (--ast-dump)Johnny Richard
Parsing can be a complex process, and it's not always easy to get a clear picture of what's happening with the AST. This commit adds a new feature to the CLI that allows us to pretty-print the AST (outputs to stdout), making it easier to visualize the tree structure and understand how the parser is working. The new --ast-dump option generates a human-readable representation of the AST, including node types, values, and child relationships. This information can be invaluable for debugging and understanding the parser's behavior. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Reviewed-by: Carlos Maniero <carlos@maniero.me>
2023-05-04lexer: Allows snake_case token namesCarlos Maniero
Signed-off-by: Carlos Maniero <carlos@maniero.me> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-04parser: Introduce statement keywordsCarlos Maniero
This commit introduces a few changes in pipalang syntax. Now, both functions and variables requires keywords to be defined. before: main(): i32 { a: i32 = 2; return a; } now: fn main(): i32 { let a: i32 = 2; return a; } Signed-off-by: Carlos Maniero <carlos@maniero.me> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-04munit: Show the filename as the first when errorCarlos Maniero
Munit uses the follow format: Error: file.c:line: error message But this error format does not works well with vim make program. This commit changes the error to the follow format: file.c:line: Error: error message Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-04lexer: Avoiding computation after find an EOFCarlos Maniero
When looking ahead, there was no check ensuring we reach EOF. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03cli: Rename src/pipac.c to src/main.cJohnny Richard
I found easier to understand the application entrypoint by calling it main.c. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-03parser: Fixes block parser memory leakCarlos Maniero
When a error occurs during a block parser the vector that stores the nodes had been destroyed but it's nodes don't. This commit fixes this by replacing the %vector_destroy% with %ast_node_destroy_vector%. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03ast: Replace init by allocation (new) functionsCarlos Maniero
All parsers have been following the patterns bellow: ast_node_t *node = ast_node_new(); ast_node_init_node_kind(node, ...args); return node; Bringing a uncessessary distraction when reading. The pattern bellow was replaced by: return ast_node_new_node_kind(...args); Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Parser allocate memory for expressionsCarlos Maniero
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Variable assignment allocates their own nodeCarlos Maniero
This commit makes variable assignment parser to allocate memory for the node. It also moves the node initialization to the ast.c to follow our standard for node initialization. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Variable declaration allocates their own nodeCarlos Maniero
Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Split block into small functionsCarlos Maniero
Since it is possible to look a future token without consuming it, it was possible to split the block parser into small chunks of code. There is the performance drawback, because now the parser makes multiple lookups to the same token. However IMO that it is not a big concern given the small computation required to get a token. Also it can be easily addressed by computing all token in advance. Memory Leak: During the refactor I found some extra memory leaks related to not released scopes. So then, more than just printing a message I introduced an assert on scope.c to make sure developers will get this feedback asap because our testing framework suppress messages from stderr when the test passes. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Use lookahead instead of consuming tokensCarlos Maniero
Previously, during block declaration, the parser consumed the token which caused some parsers (such as return and variable declaration) to not be self-contained and to depend on the callee to start the parser. In this commit, I've refactored the parser to only look for future tokens using lookahead, and delegate the consumption to child parser functions. This results in a more modular and self-contained parser that improves the overall maintainability and readability of the code. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03parser: Refactor return statement to return an ast_nodeCarlos Maniero
During the refactoring process, I identified a memory leak where the return argument was allocated but not freed in case of an error. It also introduces the concept of keyword tokens. Where return is now a keyword simplifying the parser. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03Parser: Make the parser function return the ast_nodeCarlos Maniero
In many situations, the parser is responsible for reserving memory for nodes, particularly during function body parsing. This commit introduces a new standard where parser functions not only allocate memory for ast_nodes, but also return them. In case of a parser error, a NULL pointer is returned. This standard will be extended to other parsers in future commits, ensuring consistency throughout the codebase. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03style: Improve ast node initializationCarlos Maniero
This also removes the identifier node since it was replaced by variable. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-01parser: Implement variable assignmentJohnny Richard
This commit introduces variable assignment making it possible to change a variable value. Example: myvar: i32 = 1; myvar = 2; Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Co-authored-by: Carlos Maniero <carlos@maniero.me>
2023-05-01parser: Use peek and drop token when parsing expressionsJohnny Richard
2023-05-01lexer: Peek next tokenJohnny Richard
The only way to get the next token was by consuming it. So then, our parser starts to become hard to understand, once sometimes we just want to take a look on the next token to understand what should be the next kind of expression. This commit introduces a new function that will help us to improve our parser implementation. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Reviewed-by: Carlos Maniero <carlos@maniero.me>
2023-04-30style: Invert parameters order on parser_parse_typeJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30build: Add Makefile to build pipa examplesJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30gas: Optimize variable reference on assemblyJohnny Richard
We were moving the stack data for variable reference to another stack position ending up with two pointer to the same value. // a: i32 = 1; mov $1, -8(%rbp) // b: i32 = a; mov -8(%rbp), %rax mov %rax, -24(%rbp) mov -24(%rbp), %rax mov %rax, -16(%rbp) After this changes, we wont create a new temp space on stack if we don't need it. See bellow the example after the optimization: // a: i32 = 1; mov $1, -8(%rbp) // b: i32 = a; mov -8(%rbp), %rax mov %rax, -16(%rbp) Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-30style: Rename evaluation kinds on gas generatorJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-30gas: Optimize the stack utilizationCarlos Maniero
Until now, every computation was pushed onto stack witch creates unnecessary stack manipulation and makes the generated code hard to read and understand. Now, the latest computation is stored and could be either a literal or a value on a register. When it is a register we may need to push the value to stack to avoid data loss. Now if it is a literal, hence, we can just set the value onto a register. example/main.pipa before this commit: .global _start .text _start: push %rbp mov %rsp, %rbp mov $69, %rax ; <- There is no reason to store data in rax mov %rax, %rdi mov $60, %rax syscall pop %rbp example/main.pipa after this commit: .global _start .text _start: push %rbp mov %rsp, %rbp mov $69, %rdi ; <- Fixed! mov $60, %rax syscall pop %rbp example/variables.pipa before this commit: .global _start .text _start: push %rbp mov %rsp, %rbp mov $12, %rax mov %rax, -8(%rbp) mov $32, %rax mov %rax, -16(%rbp) mov -8(%rbp), %rax mov %rax, -32(%rbp) mov -16(%rbp), %rax mov -32(%rbp), %rcx add %rcx, %rax mov %rax, -32(%rbp) mov $2, %rax mov -32(%rbp), %rcx mul %rcx mov %rax, -24(%rbp) mov $1, %rax mov %rax, -40(%rbp) mov $33, %rax mov %rax, -48(%rbp) mov -24(%rbp), %rax mov -48(%rbp), %rcx sub %rcx, %rax mov -40(%rbp), %rcx add %rcx, %rax mov %rax, -32(%rbp) mov $2, %rax mov %rax, -40(%rbp) mov -32(%rbp), %rax mov -40(%rbp), %rcx xor %rdx, %rdx div %rcx mov %rax, %rdi mov $60, %rax syscall pop %rbp example/variables.pipa after this commit: .global _start .text _start: push %rbp mov %rsp, %rbp mov $12, -8(%rbp) mov $32, -16(%rbp) mov -8(%rbp), %rax mov %rax, -32(%rbp) mov -16(%rbp), %rax mov -32(%rbp), %rcx add %rcx, %rax mov %rax, -32(%rbp) mov -32(%rbp), %rcx mov $2, %rax mul %rcx mov %rax, -24(%rbp) mov -24(%rbp), %rax mov $33, %rcx sub %rcx, %rax mov $1, %rcx add %rcx, %rax mov %rax, -32(%rbp) mov -32(%rbp), %rax mov $2, %rcx xor %rdx, %rdx div %rcx mov %rax, %rdi mov $60, %rax syscall pop %rbp Less 8 instructions! Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30gas: Compile variable expression with scope supportJohnny Richard
This patch adds the variable compilation and uses a scope (a stack of map) to lookup for identities. Today we use a vector + ref_entry structs in order to achieve the scope implementation. The ref_entry lacks memory management, we are still no sure who will be the owner of the pointer. We also want to replace the scope a hashtable_t type as soon as we get one. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-30polish: Remove unnecessary token creation when dropping tokenJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30ast: Rename variable and variable_declaration correctlyJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30make: Add linter-fix target to MakefilesJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30parser: Registry identifiers on scopeJohnny Richard
We are parsing variables/functions and checking if they are defined on scope. Otherwise we fail the parsing with a nice message. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-30style: Add void to function without argumentsJohnny Richard
After run `make CC=clang` we found the following problems: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-30style: Add -Wmissing-declarations to CC CFLAGSJohnny Richard
The refactoring also replace a if statement by switch statement. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-29ast: Introduce ast_identifier_t for named ast nodesCarlos Maniero
Prior to this change, ast_variable_declaration_t and ast_function_declaration_t used a string_view as an identifier. However, to support scoped identifiers, it is more appropriate to use an ast_identifier_t as a reference. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-29scope: Add a scope stack for identifier resolutionsCarlos Maniero
Before accepting an identifier, the parser should check if that identifier will be available. With this implementation it will be possible. Take the following code example: main(): i32 { return my_exit_code; } The parser must return an error informing that *my_exit_code* is not defined in the example above. The ast scope is a support module for parser and ast, simplifying identifier resolution. Once a curly bracket ({) is open the *scope_enter()* is called and when it is closed (}) we pop the entire stack with *scope_leave()*. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-29ast: Remove ast visitor pattern to simplify the codeJohnny Richard
I decided to remove the visitor pattern due to the lack of Object Oriented Programming support for C. Now if you want to navigate through the AST, you should do it with switch case and recursion. The code looks way simpler without visitor pattern. I have added a CFLAG -Werror which validates if the switch statement covers all branches for a given enum at compile time. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-26ast: Include a Binary Operation kind enumCarlos Maniero
The AST was using a string view to distinguish the operation kind. An enum was created for this purpose simplifying code generation. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-26lexer: Remove duplicated validationCarlos Maniero
Since there is a guard-cause checking if the token is EOF there is no need to check it again and again. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-26lexer: Split operation tokens into their own tokenCarlos Maniero
The +, -, *, and / tokens used to be TOKEN_OP, but the TOKEN_OP has been removed and a token for each operation has been introduced. Python's token names were followed: https://docs.python.org/3/library/token.html Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-25style: Use clang-format as formatter and linter toolJohnny Richard
We want to keep the code style consistent, this first commit adds a .clang-format in order to "document" our style code. This patch also adds a target *linter* to Makefile which will complain if we have any style issue on test and src dirs. I have run the follow command to create the .clang-format file: $ clang-format -style=mozilla -dump-config > .clang-format And I also made some adjusts to .clang-format changing the following properties: PointerAlignment: Right ColumnLimit: 120 Commands executed to fix the current styling: $ find . -name *.h | xargs clang-format -i $ find . -name *.c | xargs clang-format -i Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25parser: Add support for variables and identifiers in function bodyCarlos Maniero
This commit adds support for variables and identifiers in the function body of the parser, stored as a vector. However, at this point, identifier resolution is not fully implemented, and we currently accept identifiers without checking if they can be resolved. This is a known limitation that will be addressed in a future commit once hash-tables are added to the parser. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25cli: Create a CLI to generate an executable from pipa code.Carlos Maniero
This commit introduces a full-featured CLI that allows you to compile a file, set the gas and linker path, and define the executable output. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com> Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/patches/40642
2023-04-24util: Implement dynamic vector array for storing AST childrenJohnny Richard
Previously, we lacked a dynamic array for storing children elements in our abstract syntax tree (AST). This commit introduces a new implementation that dynamically adjusts its capacity as elements are added, using a doubling strategy. I considered two approaches for managing the vector's memory allocation: allocating it on the heap, or providing a vector_init function that allocates only the items array. Ultimately, I decided to provide a vector_new function for instantiating the vector, as this aligns with the expected usage pattern when there is a destroy function. With this new implementation, we can efficiently store and manage AST children, enabling more flexible and expressive tree structures. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21gas: Generate arithmetics expressionsCarlos Maniero
We decided for using push and pop to simplify the implementation, we want to revisit the approach latter. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21ast: Create an init function for ast_binary_operation_tCarlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21parser: Parse integers arithmetic expressionJohnny Richard
This patch implements the AST creation for arithmetic expressions. NOTE: The implementation works only for integer numbers. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Reviewed-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-20gas: Remove duplicated inst when generating exit SYSCALLJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20parser: Create the literal node typeCarlos Maniero
Since we want to extend our code to support multiple kind of expression it does not make sense that the return statement always return a number. For now on, return statement has an ast_node_t as argument, meaning that it could be anything. The literal_node_t was also implemented in order to keep the application behavior. Following the C's calling convention the literal values are stored at %eax and the return takes this argument to do anything it is needed. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20parser: Fix test name from lexer_test to parserJohnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20tests: Add integration testsCarlos Maniero
This tests perform the whole cycle. It takes the output from pipac compile, execute and check the returned status code. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20parser: Stop exiting on parser errorCarlos Maniero
Previously, when an error occurred during parsing, the application would exit, making it difficult to test the parser and limiting the compiler's extensibility. This commit improves the parser's error handling by allowing for continued execution after an error, enabling easier testing and increased flexibility. The parser is prepared to handle multiples errors, although the current implementation always returns a single error, it may be useful given multiples functions where we can show errors by context. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviwed-by: Johnny Richard <johnny@johnnyrichard.com>