pipac.git - Pipa programming language

Age	Commit message (Collapse)	Author
2023-05-11	gas: implement recursion and late evaluation	Carlos Maniero
	Until now the below code was not valid for pipac. fn main(): i32 { return fib(13); } fn fib(n: i32): i32 { if n <= 1 { return n; } return fib(n - 1) + fib(n - 2); } Pipa's parser was adding a function to scope after they were fully parsed which means that fib's self-reference would not work. Also, functions were required to follow the be called in the order they are declared for the same scope reason so, the main function was required to be defined after fib. And how it is working now? When a TOKEN_NAME is not found in the scope, instead of returning an error, an unknown token is created as placeholder. The parser stores the node reference and the token it was trying to parse. During type checks, if the parser detects an unknown node, instead of returning an error, it stores in that node what was the expected type. After the NS is fully parsed a reevaluation is made on those unknown nodes by setting the lexer back on the node's token position and parsing the TOKEN_NAME again. Ps: There is a typo on the unknown token. It will be addressed in another commit since this issue was not introduced by this change. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-10	gas: implement function calls	Carlos Maniero
	For now function calls are following the C's calling convention, which means they are using the following registers to pass functions' arguments: rdi, rsi, rdx, rcx, r8, r9 If a function has more then 6 parameters, the compilation will fail. To enable function with more than 6 parameters we will need to save the extra arguments on stack. Naming: parameters: function parameters are the variables a function receives. arguments: Arguments are the values passed to a function when calling it. Calling mechanism: When a function is called, all the expressions passed as argument are evaluated, after the evaluation, the result is stored on the register that represents its argument position, the first argument will be stored on rdi, the second on rsi and so on. Receiving mechanism: When a function starts, the first thing it does is store all the registers onto the stack. So rdi will be stored on -8(rbp), rsi on -16(rbp) and so on. And, a ref_entry is created making the relationship parameter-stack_offset. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-09	parser: parses an if statement no code generation	Carlos Maniero
	This commit parses a if statement following the grammar bellow: if boolean_expression { n_epressions; } No else neither code generation was implemented. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-09	parser: Add the bool type	Carlos Maniero
	This commit introduces a new type for booleans. There is no code generation for this type yet. The intention of this commit is to enable flow control in the near future. Signed-off-by: Carlos Maniero <carlos@maniero.me> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-06	lexer: Tokenize logical and bitwise operators	Carlos Maniero
	The followed logic operators were added to lexer: TOKEN_EQUAL == TOKEN_NOT ! TOKEN_NOT_EQUAL != TOKEN_GT > TOKEN_GT_EQUAL >= TOKEN_LT < TOKEN_LT_EQUAL <= TOKEN_AND && TOKEN_OR \|\| Bitwise operators were also added TOKEN_BITWISE_AND & TOKEN_BITWISE_OR \| TOKEN_BITWISE_SHIFT_LEFT << TOKEN_BITWISE_SHIFT_RIGHT >> TOKEN_BITWISE_XOR ^ TOKEN_BITWISE_NOT ~ TOKEN_EQUAL '=' was renamed TOKEN_ASSIGN, and now TOKEN_EQUAL is used for the logical comparator '=='. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-04	lexer: Allows snake_case token names	Carlos Maniero
	Signed-off-by: Carlos Maniero <carlos@maniero.me> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-04	parser: Introduce statement keywords	Carlos Maniero
	This commit introduces a few changes in pipalang syntax. Now, both functions and variables requires keywords to be defined. before: main(): i32 { a: i32 = 2; return a; } now: fn main(): i32 { let a: i32 = 2; return a; } Signed-off-by: Carlos Maniero <carlos@maniero.me> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-05-04	lexer: Avoiding computation after find an EOF	Carlos Maniero
	When looking ahead, there was no check ensuring we reach EOF. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03	parser: Use lookahead instead of consuming tokens	Carlos Maniero
	Previously, during block declaration, the parser consumed the token which caused some parsers (such as return and variable declaration) to not be self-contained and to depend on the callee to start the parser. In this commit, I've refactored the parser to only look for future tokens using lookahead, and delegate the consumption to child parser functions. This results in a more modular and self-contained parser that improves the overall maintainability and readability of the code. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-03	parser: Refactor return statement to return an ast_node	Carlos Maniero
	During the refactoring process, I identified a memory leak where the return argument was allocated but not freed in case of an error. It also introduces the concept of keyword tokens. Where return is now a keyword simplifying the parser. Signed-off-by: Carlos Maniero <carlos@maniero.me>
2023-05-01	parser: Use peek and drop token when parsing expressions	Johnny Richard

2023-05-01	lexer: Peek next token	Johnny Richard
	The only way to get the next token was by consuming it. So then, our parser starts to become hard to understand, once sometimes we just want to take a look on the next token to understand what should be the next kind of expression. This commit introduces a new function that will help us to improve our parser implementation. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Reviewed-by: Carlos Maniero <carlos@maniero.me>
2023-04-26	lexer: Remove duplicated validation	Carlos Maniero
	Since there is a guard-cause checking if the token is EOF there is no need to check it again and again. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-26	lexer: Split operation tokens into their own token	Carlos Maniero
	The +, -, *, and / tokens used to be TOKEN_OP, but the TOKEN_OP has been removed and a token for each operation has been introduced. Python's token names were followed: https://docs.python.org/3/library/token.html Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
2023-04-25	style: Use clang-format as formatter and linter tool	Johnny Richard
	We want to keep the code style consistent, this first commit adds a .clang-format in order to "document" our style code. This patch also adds a target linter to Makefile which will complain if we have any style issue on test and src dirs. I have run the follow command to create the .clang-format file: $ clang-format -style=mozilla -dump-config > .clang-format And I also made some adjusts to .clang-format changing the following properties: PointerAlignment: Right ColumnLimit: 120 Commands executed to fix the current styling: $ find . -name .h \| xargs clang-format -i $ find . -name .c \| xargs clang-format -i Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-25	parser: Add support for variables and identifiers in function body	Carlos Maniero
	This commit adds support for variables and identifiers in the function body of the parser, stored as a vector. However, at this point, identifier resolution is not fully implemented, and we currently accept identifiers without checking if they can be resolved. This is a known limitation that will be addressed in a future commit once hash-tables are added to the parser. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-21	parser: Parse integers arithmetic expression	Johnny Richard
	This patch implements the AST creation for arithmetic expressions. NOTE: The implementation works only for integer numbers. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com> Reviewed-by: Carlos Maniero <carlosmaniero@gmail.com>
2023-04-20	parser: Stop exiting on parser error	Carlos Maniero
	Previously, when an error occurred during parsing, the application would exit, making it difficult to test the parser and limiting the compiler's extensibility. This commit improves the parser's error handling by allowing for continued execution after an error, enabling easier testing and increased flexibility. The parser is prepared to handle multiples errors, although the current implementation always returns a single error, it may be useful given multiples functions where we can show errors by context. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviwed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18	style: Fix identation on lexer.c	Carlos Maniero
	Co-authored-by: Johnny Richard <johnny@johnnyrichard.com> Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/%3C20230418165847.3798-1-carlosmaniero%40gmail.com%3E
2023-04-18	lexer: Add tokenizer for OP and UNKNOWN tokens	Johnny Richard
	We want to tokenizer arithmetic expressions. We are handling exceptional cases with UNKNOWN token. Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com> Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18	lexer: Extract tokenization functions	Carlos Maniero
	make the next token function small by extracting the functions that make tokens. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-18	lexer: extract the lexer_drop_spaces	Carlos Maniero
	Extracted logic for skipping empty characters into a separate function. No change in lexer behavior. Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com> Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16	lexer: Extract lexer_define_literal_token_props function	Johnny Richard
	This is an attempt of reducing code duplication. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-16	Start using string_view on lexer and parser	Johnny Richard
	This change fixes the memory leak when token got created. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15	parser: Show filepath row and col when parsing fails	Johnny Richard
	In order to find out where a parsing error occurred, this patch introduces the exactly location following the format 'file:row:col'. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15	parser: Create parser for function with return statements	Johnny Richard
	This is a very limited parser implementation which parses a single function with return type i32 and body containing a return number statement. The parser doesn't show the 'filepath:row:col' when it fails, a future improvement would be display it to easy find where the compilation problem is located. The ast_nodes are taking the token.value ownership (which is a really bad design since not all token.value ownership has been taken causing memory leaking) but we never free them. For a future fix we could use a string_view instead since we never change the original source code. The string_view will also improve the performance a lot avoiding unnecessary heap memory allocation. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-15	build: Enable warning and debug CFLAGS	Johnny Richard
	After enabling the warning flags, the compiler was firing the following warnings: warning: implicit declaration of function ‘strdup’; did you mean ‘strcmp’? [-Wimplicit-function-declaration] token->value = strdup("("); ^~~~~~ strcmp warning: assignment to ‘char ’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion] token->value = strdup("("); ^ In order to fix these warnings above, I have decided to replace strdup* and strndup by strcpy and strncpy functions. Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
2023-04-14	lexer: Extract lexer.c and lexer.h from pipa.c	Johnny Richard
	Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>