Age | Commit message | Author
2023-04-26 | lexer: Remove duplicated validation | Carlos Maniero
Since a guard clause already checks whether the token is EOF, there is no need to check it again.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-26 | lexer: Split operation tokens into their own token | Carlos Maniero
The +, -, *, and / tokens used to share a single TOKEN_OP type. TOKEN_OP has been removed and a dedicated token type has been introduced for each operation. The names follow Python's token names: https://docs.python.org/3/library/token.html
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
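To illustrate the split described in this commit message, here is a minimal sketch of one token kind per operator. The enumerator names TOKEN_PLUS, TOKEN_MINUS, TOKEN_STAR and TOKEN_SLASH are assumptions modelled on Python's PLUS, MINUS, STAR and SLASH. The helpers token_kind_from_char and token_kind_name are hypothetical and are not taken from the repository.

```c
#include <assert.h>

/* One token kind per arithmetic operator instead of a shared TOKEN_OP.
 * The names are assumptions that mirror Python's token module. */
typedef enum token_kind {
    TOKEN_PLUS,
    TOKEN_MINUS,
    TOKEN_STAR,
    TOKEN_SLASH,
    TOKEN_UNKNOWN
} token_kind_t;

/* Hypothetical helper: map a source character to its token kind. */
token_kind_t token_kind_from_char(char c)
{
    switch (c) {
        case '+': return TOKEN_PLUS;
        case '-': return TOKEN_MINUS;
        case '*': return TOKEN_STAR;
        case '/': return TOKEN_SLASH;
        default:  return TOKEN_UNKNOWN;
    }
}

/* Hypothetical helper: printable name, useful when dumping tokens. */
const char *token_kind_name(token_kind_t kind)
{
    assert(kind <= TOKEN_UNKNOWN);
    switch (kind) {
        case TOKEN_PLUS:  return "PLUS";
        case TOKEN_MINUS: return "MINUS";
        case TOKEN_STAR:  return "STAR";
        case TOKEN_SLASH: return "SLASH";
        default:          return "UNKNOWN";
    }
}
```

With distinct kinds, a parser can switch on the token kind directly rather than inspecting the token's text to find out which operator it holds.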
2023-04-25 | style: Use clang-format as formatter and linter tool | Johnny Richard
We want to keep the code style consistent. This first commit adds a .clang-format file in order to "document" our code style. This patch also adds a *linter* target to the Makefile, which will complain if there is any style issue in the test and src dirs.
I ran the following command to create the .clang-format file:
    $ clang-format -style=mozilla -dump-config > .clang-format
I also adjusted the following properties in .clang-format:
    PointerAlignment: Right
    ColumnLimit: 120
Commands executed to fix the current styling:
    $ find . -name *.h | xargs clang-format -i
    $ find . -name *.c | xargs clang-format -i
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25 | parser: Add support for variables and identifiers in function body | Carlos Maniero
This commit adds support for variables and identifiers in the function body of the parser, stored as a vector. However, at this point, identifier resolution is not fully implemented, and we currently accept identifiers without checking if they can be resolved. This is a known limitation that will be addressed in a future commit once hash-tables are added to the parser.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25 | cli: Create a CLI to generate an executable from pipa code. | Carlos Maniero
This commit introduces a full-featured CLI that allows you to compile a file, set the gas and linker paths, and define the output executable.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/patches/40642
2023-04-24 | util: Implement dynamic vector array for storing AST children | Johnny Richard
Previously, we lacked a dynamic array for storing children elements in our abstract syntax tree (AST). This commit introduces a new implementation that dynamically adjusts its capacity as elements are added, using a doubling strategy.
I considered two approaches for managing the vector's memory allocation: allocating it on the heap, or providing a vector_init function that allocates only the items array. Ultimately, I decided to provide a vector_new function for instantiating the vector, as this aligns with the expected usage pattern when there is a destroy function.
With this new implementation, we can efficiently store and manage AST children, enabling more flexible and expressive tree structures.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
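The doubling strategy and the heap-allocated vector_new approach described in this commit message can be sketched as follows. This is a minimal illustration and not the repository's code. Only the name vector_new comes from the commit message. The struct layout, the initial capacity of 4, and the vector_push_back and vector_destroy helpers are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define VECTOR_INITIAL_CAPACITY 4 /* assumed starting capacity */

typedef struct vector {
    void **items;    /* array of borrowed pointers to the elements */
    size_t size;     /* number of elements currently stored */
    size_t capacity; /* number of slots allocated in items */
} vector_t;

/* Allocate the vector itself and its items array on the heap. */
vector_t *vector_new(void)
{
    vector_t *vector = malloc(sizeof(*vector));
    if (vector == NULL) {
        return NULL;
    }
    vector->items = malloc(VECTOR_INITIAL_CAPACITY * sizeof(void *));
    if (vector->items == NULL) {
        free(vector);
        return NULL;
    }
    vector->size = 0;
    vector->capacity = VECTOR_INITIAL_CAPACITY;
    return vector;
}

/* Append an item, doubling the capacity when the array is full.
 * Returns 0 on success and -1 if the reallocation fails. */
int vector_push_back(vector_t *vector, void *item)
{
    assert(vector != NULL);
    if (vector->size == vector->capacity) {
        size_t new_capacity = vector->capacity * 2;
        void **items = realloc(vector->items, new_capacity * sizeof(void *));
        if (items == NULL) {
            return -1;
        }
        vector->items = items;
        vector->capacity = new_capacity;
    }
    vector->items[vector->size++] = item;
    return 0;
}

/* Release the items array and the vector; stored items are not freed. */
void vector_destroy(vector_t *vector)
{
    if (vector == NULL) {
        return;
    }
    free(vector->items);
    free(vector);
}
```

Pairing vector_new with vector_destroy keeps allocation and release symmetric, which matches the usage pattern the commit message refers to. Doubling keeps the amortized cost of each append constant.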
2023-04-21 | gas: Generate arithmetic expressions | Carlos Maniero
We decided to use push and pop to simplify the implementation; we want to revisit this approach later.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21 | ast: Create an init function for ast_binary_operation_t | Carlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21 | parser: Parse integer arithmetic expressions | Johnny Richard
This patch implements the AST creation for arithmetic expressions.
NOTE: The implementation works only for integer numbers.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Reviewed-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-20 | gas: Remove duplicated inst when generating exit SYSCALL | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | parser: Create the literal node type | Carlos Maniero
Since we want to extend our code to support multiple kinds of expressions, it does not make sense for the return statement to always return a number. From now on, the return statement takes an ast_node_t as its argument, meaning that it could be anything. The literal_node_t was also implemented in order to keep the application's behavior. Following C's calling convention, literal values are stored in %eax and the return statement takes this value to do whatever is needed.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | parser: Fix test name from lexer_test to parser | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | tests: Add integration tests | Carlos Maniero
These tests exercise the whole cycle: they take the output of the pipac compilation, execute it, and check the returned status code.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | parser: Stop exiting on parser error | Carlos Maniero
Previously, when an error occurred during parsing, the application would exit, making it difficult to test the parser and limiting the compiler's extensibility. This commit improves the parser's error handling by allowing execution to continue after an error, enabling easier testing and increased flexibility.
The parser is prepared to handle multiple errors, although the current implementation always returns a single error. This may be useful once there are multiple functions and errors can be shown per context.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviwed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-20 | ast: Allows recursive nodes | Carlos Maniero
Previously, the abstract syntax tree (AST) used static types, meaning that an ast_function_t would always have an ast_return_stmt_t as its body. However, this assumption is not always true, as we may have void functions that do not have a return statement. Additionally, the ast_return_stmt_t always had a number associated with it, but this too is not always the case.
To make this possible, I needed to make a few changes across the whole project. One of the main changes is that the inheritance hack is gone. That mechanism was replaced by composition, with pointers where required for recursive type references.
It is important to mention that I decided to use a union type to implement the composition. There are two main advantages to this approach:
1. There is only one function to allocate memory for all kinds of nodes.
2. There is no need to cast the data.
In summary, this commit introduces changes to support dynamic typing in the AST, by replacing the inheritance hack with composition and using union types to simplify memory allocation and type casting.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
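The composition-with-union design described in this commit message might look roughly like the sketch below. It is an illustration under assumptions and not the project's actual definitions. The type names ast_node_t, ast_function_t and ast_return_stmt_t appear in the commit messages. The kind enum, the ast_literal_t struct, the field names, and the ast_node_new and ast_node_destroy helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical discriminator telling which union member is active. */
typedef enum ast_node_kind {
    AST_LITERAL,
    AST_RETURN_STMT,
    AST_FUNCTION
} ast_node_kind_t;

typedef struct ast_node ast_node_t;

typedef struct { int32_t value; } ast_literal_t;          /* assumed shape */
typedef struct { ast_node_t *argument; } ast_return_stmt_t; /* any node */
typedef struct {
    const char *name;
    ast_node_t *body; /* may be NULL, e.g. a function without a return */
} ast_function_t;

/* Composition: every node embeds a union of the concrete node types,
 * and recursion goes through ast_node_t pointers. */
struct ast_node {
    ast_node_kind_t kind;
    union {
        ast_literal_t literal;
        ast_return_stmt_t return_stmt;
        ast_function_t function;
    } data;
};

/* Advantage 1: a single allocation function serves every kind of node. */
ast_node_t *ast_node_new(ast_node_kind_t kind)
{
    assert(kind <= AST_FUNCTION);
    ast_node_t *node = calloc(1, sizeof(*node));
    if (node != NULL) {
        node->kind = kind;
    }
    return node;
}

/* Recursively free a node and the children it points to. */
void ast_node_destroy(ast_node_t *node)
{
    if (node == NULL) {
        return;
    }
    switch (node->kind) {
        case AST_RETURN_STMT:
            ast_node_destroy(node->data.return_stmt.argument);
            break;
        case AST_FUNCTION:
            ast_node_destroy(node->data.function.body);
            break;
        case AST_LITERAL:
            break;
    }
    free(node);
}
```

Advantage 2 shows up at the use site: a field is reached as node->data.return_stmt.argument, with no cast from a base struct to a derived one.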
2023-04-19 | tests: Include parser_parse_function test | Carlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18 | lexer: Include a test function assert_token_at | Carlos Maniero
With this function, it is now possible to assert on the token at a given index.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/%3C20230418170136.3949-1-carlosmaniero%40gmail.com%3E
2023-04-18 | style: Fix indentation on lexer.c | Carlos Maniero
Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/%3C20230418165847.3798-1-carlosmaniero%40gmail.com%3E
2023-04-18 | lexer: Add tokenizer for OP and UNKNOWN tokens | Johnny Richard
We want to tokenize arithmetic expressions. Exceptional cases are handled with the UNKNOWN token.
Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18 | tests: Include lexer's number tokenizer tests | Carlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18 | lexer: Extract tokenization functions | Carlos Maniero
Make the next-token function smaller by extracting the functions that create tokens.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18 | lexer: extract the lexer_drop_spaces | Carlos Maniero
Extracted logic for skipping empty characters into a separate function. No change in lexer behavior.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18 | ast: Create AST visitor to traverse the tree | Johnny Richard
In the future, we want to be able to traverse the tree to pretty-print it, generate code for other targets such as LLVM, or transpile to C. This solution also implements x86_64 Linux gas assembly code generation using the visitor interface.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-17 | style: Add .editorconfig and set {.c,.h} default settings | Johnny Richard
This .editorconfig provides cross-editor settings that should be respected when creating a new file. We can also make the Makefile use tabs instead of spaces in a future patch.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16 | lexer: Extract lexer_define_literal_token_props function | Johnny Richard
This is an attempt to reduce code duplication.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16 | Start using string_view on lexer and parser | Johnny Richard
This change fixes the memory leak that occurred when a token was created.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16 | util: Create string_view tool to optimize memory usage | Johnny Richard
We are allocating heap memory to create token values; we can minimize the number of allocations if we start using string_view.
There are other problems: right now the ownership of token values is quite unclear, since the AST nodes also share the memory allocated by the token_get_next_token function. It's important to clarify that the current implementation also has memory leaks.
Hence, we are going to start using string_view to make memory management easier. :^)
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
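As a rough illustration of why a string_view avoids the allocations and ownership questions raised in this commit message, the sketch below represents a token value as a pointer and a length into the source buffer. The type name string_view_t, its fields, and the helper functions are assumptions for this example and are not the repository's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A non-owning view: it borrows bytes from the source buffer and is
 * never freed on its own. The layout is an assumption. */
typedef struct string_view {
    const char *str; /* first byte of the view, not NUL-terminated */
    size_t size;     /* number of bytes in the view */
} string_view_t;

/* Build a view over `size` bytes starting at `str`; nothing is copied. */
string_view_t string_view_new(const char *str, size_t size)
{
    assert(str != NULL);
    return (string_view_t){ .str = str, .size = size };
}

/* Compare two views byte by byte. */
bool string_view_eq(string_view_t a, string_view_t b)
{
    return a.size == b.size && memcmp(a.str, b.str, a.size) == 0;
}

/* Compare a view against a NUL-terminated C string. */
bool string_view_is(string_view_t view, const char *cstr)
{
    size_t length = strlen(cstr);
    return view.size == length && memcmp(view.str, cstr, length) == 0;
}
```

Because a view only borrows from the source text, tokens and AST nodes can share it freely as long as the source buffer outlives them. The source buffer is then the only thing that needs to be freed.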
2023-04-16 | test: Add munit test framework | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16 | build: Rename make target clear to clean | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15 | parser: Generate GAS 64-bit assembly for linux | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15 | cli: Remove irrelevant information when loading source | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15 | parser: Show filepath row and col when parsing fails | Johnny Richard
In order to find out where a parsing error occurred, this patch reports the exact location in the format 'file:row:col'.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15 | parser: Create parser for function with return statements | Johnny Richard
This is a very limited parser implementation that parses a single function with return type i32 and a body containing a statement that returns a number.
The parser doesn't show the 'filepath:row:col' when it fails; a future improvement would be to display it, making it easy to find where the compilation problem is located.
The ast_nodes take ownership of token.value but never free it (a really bad design, since not every token.value has its ownership taken, which causes memory leaks). As a future fix, we could use a string_view instead, since we never change the original source code. The string_view would also improve performance considerably by avoiding unnecessary heap allocations.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15 | build: Enable warning and debug CFLAGS | Johnny Richard
After enabling the warning flags, the compiler emitted the following warnings:
    warning: implicit declaration of function ‘strdup’; did you mean ‘strcmp’? [-Wimplicit-function-declaration]
        token->value = strdup("(");
                       ^~~~~~
                       strcmp
    warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
        token->value = strdup("(");
                     ^
To fix these warnings, I decided to replace the *strdup* and *strndup* functions with *strcpy* and *strncpy*.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
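The commit message does not show the replacement code, so the following is only a sketch of what copying a token value with strncpy instead of strndup could look like. The helper name copy_token_value is hypothetical. The point of the sketch is that strncpy does not allocate and does not guarantee a terminating NUL, so the caller has to do both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate `length` bytes of `start` into a new
 * NUL-terminated heap string using only ISO C functions. */
char *copy_token_value(const char *start, size_t length)
{
    assert(start != NULL);

    /* Unlike strndup, strncpy needs a destination buffer up front. */
    char *value = malloc(length + 1);
    if (value == NULL) {
        return NULL;
    }

    /* strncpy copies at most `length` bytes and adds no terminator
     * when the source is at least that long, so add it explicitly. */
    strncpy(value, start, length);
    value[length] = '\0';
    return value; /* the caller owns the result and must free() it */
}
```

strdup and strndup are POSIX functions rather than ISO C ones, which is why a strict C standard mode can leave them undeclared and trigger the implicit-declaration warning quoted above.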
2023-04-14 | build: Add clear target to Makefile | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-14 | lexer: Extract lexer.c and lexer.h from pipa.c | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-14 | build: Move *.c to src folder | Johnny Richard
We want to have different folders for source and object files.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-14 | cli: Create a function to print tokens | Carlos Maniero
This logic used to live inside the main function.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-14 | cli: Add missing LF on print_usage | Carlos Maniero
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-14 | lexer: Fix string format warning | Carlos Maniero
warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
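The commit message quotes only the warning, so the fix below is an assumption about the general technique and not the actual change: size_t values are printed with the %zu conversion instead of %d. The helper format_position and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "row:col" into `buffer`.
 * %zu is the standard conversion for size_t; %d expects an int and
 * triggers -Wformat when given a size_t. */
int format_position(char *buffer, size_t buffer_size, size_t row, size_t col)
{
    assert(buffer != NULL);
    return snprintf(buffer, buffer_size, "%zu:%zu", row, col);
}
```

Casting the argument to int would also silence the warning, but it can truncate large values, so matching the conversion to the argument type is usually the safer choice.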
2023-04-13 | Create initial project structure + lexer | Johnny Richard
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>