Age | Commit message (Collapse) | Author |
|
Signed-off-by: Carlos Maniero <carlos@maniero.me>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
This commit introduces a few changes to the pipalang syntax. Both
functions and variables now require a keyword to be defined.
before:
    main(): i32 {
      a: i32 = 2;
      return a;
    }
now:
    fn main(): i32 {
      let a: i32 = 2;
      return a;
    }
Signed-off-by: Carlos Maniero <carlos@maniero.me>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
When looking ahead, there was no check for whether EOF had been reached.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
|
|
Previously, during block declaration, the parser consumed the token,
which caused some parsers (such as return and variable declaration) to
not be self-contained and to depend on the caller to start the parse.
In this commit, I've refactored the parser to only inspect upcoming
tokens using lookahead and to delegate consumption to the child parser
functions. This results in a more modular and self-contained parser,
which improves the overall maintainability and readability of the code.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
|
|
During the refactoring process, I identified a memory leak where the
return argument was allocated but not freed in case of an error.
This commit also introduces the concept of keyword tokens: return is now
a keyword, which simplifies the parser.
Signed-off-by: Carlos Maniero <carlos@maniero.me>
|
|
|
|
The only way to get the next token was to consume it. As a result, our
parser was becoming hard to understand, since sometimes we just want to
look at the next token to decide what kind of expression comes next.
This commit introduces a new function that lets us look ahead without
consuming, which will help us improve our parser implementation.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Reviewed-by: Carlos Maniero <carlos@maniero.me>
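A minimal sketch of such a lookahead function, assuming a lexer that tracks
its position with a single cursor. The names lexer_t, token_t,
lexer_next_token and lexer_peek_token are hypothetical and the token set is
reduced for illustration; pipalang's actual structs may differ.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified types: not pipalang's actual definitions. */
typedef enum { TOKEN_NUMBER, TOKEN_PLUS, TOKEN_EOF, TOKEN_UNKNOWN } token_kind_t;

typedef struct {
    token_kind_t kind;
    const char *start;
    size_t length;
} token_t;

typedef struct {
    const char *src;
    size_t cursor;
} lexer_t;

/* Consumes and returns the next token, advancing the cursor. */
token_t lexer_next_token(lexer_t *lexer)
{
    while (isspace((unsigned char)lexer->src[lexer->cursor])) {
        lexer->cursor++;
    }
    const char *start = lexer->src + lexer->cursor;
    if (*start == '\0') {
        return (token_t){ TOKEN_EOF, start, 0 };
    }
    if (isdigit((unsigned char)*start)) {
        size_t length = 0;
        while (isdigit((unsigned char)start[length])) {
            length++;
        }
        lexer->cursor += length;
        return (token_t){ TOKEN_NUMBER, start, length };
    }
    lexer->cursor++;
    return (token_t){ *start == '+' ? TOKEN_PLUS : TOKEN_UNKNOWN, start, 1 };
}

/* Returns the next token without consuming it: save the cursor, lex, restore. */
token_t lexer_peek_token(lexer_t *lexer)
{
    size_t saved_cursor = lexer->cursor;
    token_t token = lexer_next_token(lexer);
    lexer->cursor = saved_cursor;
    return token;
}
```

Saving and restoring the cursor keeps the peek function trivially consistent
with the consuming one, at the cost of lexing the same token twice.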
|
|
Since there is a guard clause checking whether the token is EOF, there
is no need to check it again and again.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
|
|
The +, -, *, and / tokens used to be TOKEN_OP, but the TOKEN_OP has been
removed and a token for each operation has been introduced. Python's
token names were followed: https://docs.python.org/3/library/token.html
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichar.com>
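As an illustration only, one token kind per operator could look like the
sketch below. The enum members follow Python's PLUS, MINUS, STAR and SLASH
naming with an assumed TOKEN_ prefix, and token_kind_from_char is a
hypothetical helper, not necessarily what the commit contains.

```c
/* Hypothetical token kinds, one per arithmetic operator. */
typedef enum {
    TOKEN_PLUS,   /* + */
    TOKEN_MINUS,  /* - */
    TOKEN_STAR,   /* * */
    TOKEN_SLASH,  /* / */
    TOKEN_UNKNOWN
} token_kind_t;

/* Maps a single operator character to its dedicated token kind. */
token_kind_t token_kind_from_char(char c)
{
    switch (c) {
    case '+':
        return TOKEN_PLUS;
    case '-':
        return TOKEN_MINUS;
    case '*':
        return TOKEN_STAR;
    case '/':
        return TOKEN_SLASH;
    default:
        return TOKEN_UNKNOWN;
    }
}
```

With dedicated kinds, the parser can dispatch on token kind alone instead of
comparing the token's text to find out which operator it holds.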
|
|
We want to keep the code style consistent, so this first commit adds a
.clang-format in order to "document" our code style.
This patch also adds a *linter* target to the Makefile, which will
complain if there is any style issue in the test and src dirs.
I ran the following command to create the .clang-format file:
$ clang-format -style=mozilla -dump-config > .clang-format
I also made some adjustments to .clang-format, changing the following
properties:
PointerAlignment: Right
ColumnLimit: 120
Commands executed to fix the current styling:
$ find . -name '*.h' | xargs clang-format -i
$ find . -name '*.c' | xargs clang-format -i
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
This commit adds support for variables and identifiers in the function
body of the parser, stored as a vector.
However, at this point, identifier resolution is not fully implemented,
and we currently accept identifiers without checking if they can be
resolved. This is a known limitation that will be addressed in a future
commit once hash-tables are added to the parser.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
This patch implements the AST creation for arithmetic expressions.
NOTE:
The implementation works only for integer numbers.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
Reviewed-by: Carlos Maniero <carlosmaniero@gmail.com>
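To make the shape of such a tree concrete, here is a minimal sketch of an
integer-only arithmetic AST. The ast_node_t layout and the ast_eval helper are
assumptions for illustration; the patch itself only builds the tree, and the
evaluator is included just to show how the nodes nest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout: not pipalang's actual AST. */
typedef enum { AST_NUMBER, AST_BINARY_OP } ast_kind_t;

typedef struct ast_node {
    ast_kind_t kind;
    int32_t number;               /* valid when kind == AST_NUMBER */
    char op;                      /* '+', '-', '*' or '/' when AST_BINARY_OP */
    const struct ast_node *left;  /* operands, valid when AST_BINARY_OP */
    const struct ast_node *right;
} ast_node_t;

/* Walks the tree bottom-up; integers only, like the patch. */
int32_t ast_eval(const ast_node_t *node)
{
    if (node->kind == AST_NUMBER) {
        return node->number;
    }
    int32_t left = ast_eval(node->left);
    int32_t right = ast_eval(node->right);
    switch (node->op) {
    case '+':
        return left + right;
    case '-':
        return left - right;
    case '*':
        return left * right;
    case '/':
        return left / right; /* the caller must ensure right != 0 */
    default:
        return 0;
    }
}
```

Operator precedence is encoded purely by nesting: for 1 + 2 * 3 the
multiplication node is the right child of the addition node.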
|
|
Previously, when an error occurred during parsing, the application
would exit, making it difficult to test the parser and limiting the
compiler's extensibility. This commit improves the parser's error
handling by allowing for continued execution after an error, enabling
easier testing and increased flexibility.
The parser is prepared to handle multiple errors. Although the current
implementation always returns a single error, this may be useful once
there are multiple functions and errors can be shown by context.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
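A rough sketch of the idea, assuming the parser records errors in a fixed-size
list and reports failure through its return value instead of calling exit().
The names parser_t, parser_error_t, parser_report and parser_expect, as well as
the message format, are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PARSER_MAX_ERRORS 8

/* Hypothetical error record and parser state. */
typedef struct {
    char message[128];
} parser_error_t;

typedef struct {
    parser_error_t errors[PARSER_MAX_ERRORS];
    size_t errors_len;
} parser_t;

/* Records an error; silently drops it once the list is full. */
void parser_report(parser_t *parser, const char *expected, const char *got)
{
    if (parser->errors_len == PARSER_MAX_ERRORS) {
        return;
    }
    parser_error_t *error = &parser->errors[parser->errors_len++];
    snprintf(error->message, sizeof(error->message),
             "expected '%s' but got '%s'", expected, got);
}

/* Returns false on a mismatch so the caller decides what to do next. */
bool parser_expect(parser_t *parser, const char *expected, const char *got)
{
    if (strcmp(expected, got) == 0) {
        return true;
    }
    parser_report(parser, expected, got);
    return false;
}
```

Because nothing terminates the process, a test can feed invalid input to the
parser and then assert on the recorded errors.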
|
|
Co-authored-by: Johnny Richard <johnny@johnnyrichard.com>
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Link: https://lists.sr.ht/~johnnyrichard/pipalang-devel/%3C20230418165847.3798-1-carlosmaniero%40gmail.com%3E
|
|
We want to tokenize arithmetic expressions.
Exceptional cases are handled with the UNKNOWN token.
Co-authored-by: Carlos Maniero <carlosmaniero@gmail.com>
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
Make the next-token function smaller by extracting the
functions that create tokens.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
Extracted logic for skipping empty characters into a
separate function. No change in lexer behavior.
Signed-off-by: Carlos Maniero <carlosmaniero@gmail.com>
Reviewed-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
This is an attempt to reduce code duplication.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
This change fixes a memory leak that occurred when a token was created.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
|
|
In order to find out where a parsing error occurred, this patch
introduces the exact location, following the format 'file:row:col'.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
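One possible way to produce such a location, sketched under the assumption
that rows and columns are 1-based and derived from a byte offset into the
source. The location_t type and both helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical location type; 1-based rows and columns are an assumption. */
typedef struct {
    size_t row;
    size_t col;
} location_t;

/* Computes the row and column of byte `offset` by counting newlines. */
location_t location_from_offset(const char *src, size_t offset)
{
    location_t location = { 1, 1 };
    for (size_t i = 0; i < offset && src[i] != '\0'; i++) {
        if (src[i] == '\n') {
            location.row++;
            location.col = 1;
        } else {
            location.col++;
        }
    }
    return location;
}

/* Writes "file:row:col" into buf and returns snprintf's result. */
int location_format(char *buf, size_t size, const char *file, location_t location)
{
    return snprintf(buf, size, "%s:%zu:%zu", file, location.row, location.col);
}
```

A lexer that already tracks the current row and the offset of the last newline
can skip the scan and fill in the location directly.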
|
|
This is a very limited parser implementation, which parses a single
function with return type i32 and a body containing a single
return-number statement.
The parser doesn't show the 'filepath:row:col' when it fails; a future
improvement would be to display it, making it easy to find where the
compilation problem is located.
The ast_nodes take ownership of token.value but never free it. This is
a really bad design: ownership is not taken for every token.value, so
memory leaks. As a future fix we could use a string_view instead, since
we never change the original source code. The string_view would also
improve performance considerably by avoiding unnecessary heap
allocations.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
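The string_view idea mentioned above could be sketched as follows. The
string_view_t type and its helpers are hypothetical; the point is that a view
only borrows a slice of the source buffer, so nothing is allocated or freed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-owning view into the original source buffer. */
typedef struct {
    const char *data;
    size_t length;
} string_view_t;

/* Creates a view of `length` bytes starting at `start`; copies nothing. */
string_view_t string_view_from(const char *src, size_t start, size_t length)
{
    return (string_view_t){ src + start, length };
}

/* Compares a view against a NUL-terminated string. */
bool string_view_eq_cstr(string_view_t view, const char *cstr)
{
    return strlen(cstr) == view.length && memcmp(view.data, cstr, view.length) == 0;
}
```

The trade-off is that a view is only valid while the source buffer is alive,
and it is not NUL-terminated, so it must be printed with an explicit length.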
|
|
After enabling the warning flags, the compiler was firing the following
warnings:
warning: implicit declaration of function ‘strdup’; did you mean ‘strcmp’? [-Wimplicit-function-declaration]
token->value = strdup("(");
^~~~~~
strcmp
warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
token->value = strdup("(");
^
In order to fix the warnings above, I have decided to replace *strdup*
and *strndup* with the *strcpy* and *strncpy* functions.
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
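A minimal sketch of what such a replacement can look like, assuming the
copies are still heap-allocated: allocate the buffer explicitly, then copy
with strcpy or strncpy. The helpers copy_string and copy_string_n are
hypothetical names, not taken from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical strdup stand-in: allocate, then copy with strcpy. */
char *copy_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);
    if (dst == NULL) {
        return NULL;
    }
    strcpy(dst, src);
    return dst;
}

/* Hypothetical strndup stand-in: copies at most n bytes and NUL-terminates. */
char *copy_string_n(const char *src, size_t n)
{
    size_t length = 0;
    while (length < n && src[length] != '\0') {
        length++;
    }
    char *dst = malloc(length + 1);
    if (dst == NULL) {
        return NULL;
    }
    strncpy(dst, src, length);
    dst[length] = '\0'; /* strncpy does not terminate when it copies exactly length bytes */
    return dst;
}
```

The warning appears because strdup is a POSIX function that strict ISO C modes
before C23 do not declare. Defining _POSIX_C_SOURCE as 200809L before the first
include is another common way to make the declaration visible.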
|
|
Signed-off-by: Johnny Richard <johnny@johnnyrichard.com>
|