Abstract Syntax Trees are the bane of my existence.

Ranter

atheist

10689

Comments

1

devRancid

635

1y

Why?
2

djsumdog

6577

1y

...and yet, they're the core mechanic of nearly every compiler and interpreter.
0

atheist

10689

1y

I'm writing a docstring linter for Python, which requires parsing Python into an AST and extracting the docstrings. They use reStructuredText markup, so I'm now trying to parse *that* into another AST. The tools for both are of mixed quality...
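
A minimal sketch of the first step (Python source to AST to docstrings), assuming the standard-library `ast` module. The `SOURCE` sample and the `extract_docstrings` helper are made up for illustration and are not the linter's actual code.

```python
import ast

# Hypothetical input, only here to have something to parse.
SOURCE = '''
"""Module docstring."""

class Greeter:
    """Say hello."""

    def greet(self, name):
        """Return a greeting for ``name``."""
        return f"Hello, {name}"

def undocumented():
    pass
'''

DOCUMENTABLE = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def extract_docstrings(source):
    """Map each documented module, class, or function name to its docstring."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DOCUMENTABLE):
            doc = ast.get_docstring(node)  # None when there is no docstring
            if doc is not None:
                # Module nodes have no name, so use a placeholder key.
                found[getattr(node, "name", "<module>")] = doc
    return found


docstrings = extract_docstrings(SOURCE)
```

Keying by bare name is a simplification: two methods with the same name in different classes would collide, so a real linter would probably key by qualified name or line number.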
0

atheist

10689

1y

@djsumdog they work insanely well, but not the easiest thing to interact with...
2

lorentz

15214

1y

Ah, the *python AST* is the bane of your existence. If it's any consolation, I'm pretty sure everyone who's parsing Python is suffering from it just as much. Utterly deranged grammar. There are natural languages that are easier to parse with a program than Python.
0

atheist

10689

1y

@lorentz I have to say, astroid (pylint's custom AST parser) is insanely good for my purpose. I gave up trying to use Python's built-in `ast` library, it was just a headache. I've got all the docstrings extracted and am now trying to parse reStructuredText. Which, honestly... is worse...

There's the docstring_parser library which is insanely noncompliant with the standard.

There's docutils, which probably _is_ standards compliant but has zero type annotations and at best mediocre documentation for parsing.

There's Sphinx, which basically defines the standard. But I looked at the code: it's ugly as hell and overly complicated. The standard itself can fit on a page or two, so... I'm currently just writing a simple AST parser for it myself. Or at least one that parses the structural markup.
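
To give an idea of how small "just the structural markup" can be, here is a rough sketch that assumes the structure of interest is the Sphinx-style field list (`:param x:`, `:returns:` and friends). The `Field` class, `FIELD_RE`, and `parse_fields` are hypothetical names, and this covers only a small slice of reStructuredText.

```python
import re
from dataclasses import dataclass
from typing import Optional

# ":kind arg: body" or ":kind: body" at the start of a line.
FIELD_RE = re.compile(r"^:(?P<kind>\w+)(?:\s+(?P<arg>[^:]+))?:\s*(?P<body>.*)$")


@dataclass
class Field:
    kind: str            # e.g. "param", "returns"
    arg: Optional[str]   # e.g. the parameter name, or None
    body: str            # description text, continuation lines joined


def parse_fields(docstring):
    """Split a docstring into (summary text, list of Field nodes)."""
    summary, fields = [], []
    for raw in docstring.splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            fields.append(Field(match["kind"], match["arg"], match["body"]))
        elif fields and line:
            # Simplification: any non-empty line after a field is treated
            # as a continuation of that field, regardless of indentation.
            fields[-1].body += " " + line
        elif not fields:
            summary.append(line)
    return " ".join(part for part in summary if part), fields


DOC = """Add two numbers.

:param a: first operand
:param b: second operand,
    which may be negative
:returns: the sum
"""

summary, fields = parse_fields(DOC)
```

A linter could then compare the `param` fields against the function's real signature from the Python AST.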

This whole project started because I wanted something that would lint class attribute docstrings, the ones declared as string literals on the line after the attribute, and I couldn't find anything. "How hard could that be?!" Ahhh... So naive...
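
For the attribute docstring case specifically, the check can be sketched with the standard-library `ast` module: look for an assignment in a class body whose next statement is a bare string literal. The `Config` class and the `attribute_docstrings` helper are invented for this example, and a real linter (astroid-based or otherwise) would likely differ.

```python
import ast

# Hypothetical class with one undocumented attribute.
SOURCE = '''
class Config:
    timeout: int = 30
    """Seconds to wait before giving up."""

    retries = 3

    verbose = False
    """Print progress messages."""
'''


def attribute_docstrings(source):
    """Return {(class_name, attr_name): docstring or None}."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        body = node.body
        # Pair every statement with the one that follows it (None at the end).
        for stmt, following in zip(body, body[1:] + [None]):
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                names = [stmt.target.id]
            elif isinstance(stmt, ast.Assign):
                names = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
            else:
                continue
            doc = None
            # A bare string expression right after the assignment counts
            # as that attribute's docstring.
            if (isinstance(following, ast.Expr)
                    and isinstance(following.value, ast.Constant)
                    and isinstance(following.value.value, str)):
                doc = following.value.value
            for name in names:
                result[(node.name, name)] = doc
    return result


docs = attribute_docstrings(SOURCE)
missing = [key for key, doc in docs.items() if doc is None]
```

Here `missing` ends up holding the attributes a linter would flag.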
0

typosaurus

10743

1y

How are docstrings nodes? My lexer skips all that. Or maybe they filter it at the parser level. You don't want a comment going through your interpreter, I guess.
0

typosaurus

10743

1y

I would've used a regex to find the function names and then just copied the docstring from there. Maybe you can even capture it with a match group.
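
A sketch of what that regex approach could look like, with one named group for the function name and one for the docstring. The pattern and the sample source are illustrative only. It assumes triple-double-quoted docstrings directly after the `def` line and no parentheses inside the parameter list, so it misses cases an AST handles for free.

```python
import re

# def NAME(params) [-> annotation]: newline, then a """docstring"""
DEF_WITH_DOC = re.compile(
    r'^[ \t]*def[ \t]+(?P<name>\w+)[ \t]*\([^)]*\)[^:\n]*:[ \t]*\n'
    r'[ \t]*"""(?P<doc>.*?)"""',
    re.MULTILINE | re.DOTALL,  # DOTALL lets the docstring span several lines
)

# Hypothetical input.
SOURCE = '''
def add(a, b):
    """Add two numbers."""
    return a + b

def undocumented():
    pass

def scale(x: float, factor: float = 2.0) -> float:
    """Scale ``x``.

    Multi-line docstrings work too.
    """
    return x * factor
'''

found = {
    match["name"]: match["doc"].strip()
    for match in DEF_WITH_DOC.finditer(SOURCE)
}
```

The parameter list uses `[^)]*` rather than `.*?` on purpose: with `re.DOTALL`, a lazy `.*?` could stretch from an undocumented function across to the next function's docstring and attribute it to the wrong name.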
1

lorentz

15214

1y

@retoor I lex comments, then in the parsing stage keep them if they're at item level and discard them if they're inside expressions. That way you can look up the docstrings associated with a constant or namespace, but they don't complicate the expression level.
1

lorentz

15214

1y

@retoor in the built-in test runner, a test is a constant with the comment --[| test |]--


rant