16
billgates
73d

Random question that apparently screwed up even Google...

how does an IDE evaluate (and format) code and errors so quickly?

Was thinking maybe building a compiler/programming language would be the best way to learn/practice algorithms... Is it?

Comments
  • 3
    I actually want to know exactly how they detect errors, because I have an idea for basically a rubber duck that'll fucking glow red if my code has any errors. Obviously I'd need to know how to detect errors if I want to do it though.
  • 3
    You are thinking about linters. I think all languages have them. You just feed it a file and it returns warnings, errors and what not.
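    For the duck idea, a minimal sketch in Python of "feed it source, get errors back", assuming the standard-library ast module is an acceptable stand-in for a real linter. The names find_syntax_error and duck_colour are made up for illustration, and this only catches syntax errors, not warnings.

```python
import ast


def find_syntax_error(source: str):
    """Return (line, column, message) for the first syntax error, or None."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return (err.lineno, err.offset, err.msg)
    return None


def duck_colour(source: str) -> str:
    # Hypothetical helper: the colour the duck should glow for this source.
    return "red" if find_syntax_error(source) else "green"
```

    Like the scripting-language parsers that stop at the first problem, this reports only one error per run.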
  • 1
    @24th-Dragon so how do you code one and how does it work/parse/identify issues so quickly?
  • 1
    Have u looked at neovim? not because of vim necessarily but just general customization seems to be somewhat well documented and open to build on top of.

    I started making attempts to install it and sway/wayland on a slackware vm the other day. Haven’t been successful so far but that’s what i intend to use it for when i’m done.
  • 1
    @M1sf3t hmm... It's in C... And, well, lots of pieces. Sorta want to know how IDEs can parse, highlight, understand and format the code so quickly.
  • 11
    It's basically the first half of a compiler: the parser that reads the code and generates an AST (abstract syntax tree).

    While doing this it can detect errors. How many depends on how good it is and whether it has rules for recovering after the first error, for example restarting at the next line or word until it finds something that looks like valid structure. With recovery like that it can report multiple errors.

    On top of that it can apply rules to the tree to detect things like uninitialized variables, by walking the tree and checking whether a variable is used before it is set, and so on.

    It's not trivial but also not magic :)

    Also, the parsing categorizes all elements, which can be used for highlighting.

    Some simple highlighting can even be done using simple pattern matching, but that's rarely enough by today's standards.
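    A rough sketch of the "walk the tree and look for usage before setting" rule, assuming Python's standard ast module as the parser. The class UseBeforeSet and the function check are hypothetical names, and this deliberately ignores functions, scopes, branches and imports, so treat it as an illustration rather than a real analysis.

```python
import ast
import builtins


class UseBeforeSet(ast.NodeVisitor):
    """Flag names that are read before any assignment, in source order."""

    def __init__(self):
        self.defined = set(dir(builtins))  # print, len, ... count as defined
        self.problems = []

    def visit_Assign(self, node):
        # The right-hand side is evaluated before the targets are bound.
        self.visit(node.value)
        for target in node.targets:
            self.visit(target)

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            self.defined.add(node.id)
        elif node.id not in self.defined:
            self.problems.append((node.lineno, node.id))


def check(source):
    checker = UseBeforeSet()
    checker.visit(ast.parse(source))
    return checker.problems
```

    Calling check("a = 1\nb = a + c\nc = 2\n") reports c on line 2, because it is read there before the assignment on line 3.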
  • 0
    @billgates you can build a basic one by using flex / yacc. I don't know exactly how IDEs do them. They probably keep in memory a table of global variables, classes etc. in your code and run the lint only on files that were changed.
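    A guess at what "run the lint only on files that were changed" could look like, as a Python sketch. LintCache is a made-up name, the lint function is passed in by the caller, and real IDEs may well track changes differently (editor events, timestamps and so on).

```python
import hashlib


class LintCache:
    """Re-run the linter only when a file's content has actually changed."""

    def __init__(self, lint):
        self._lint = lint   # any function: source text -> list of findings
        self._seen = {}     # path -> (content digest, cached findings)

    def check(self, path, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = self._seen.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]            # unchanged: reuse the old result
        findings = self._lint(text)     # changed or new: lint again
        self._seen[path] = (digest, findings)
        return findings
```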
  • 0
    @Voxera yes I was thinking about a parser as well at first. How would a program check if some text is valid html, json, etc. And parse it into a tree... So quickly. Have to match all the closing Tags and locate and parse all the children which also have children...
  • 0
    @24th-Dragon well then the question would be how does lint work then. It has to understand and validate the text.
  • 4
    @billgates computers are fast ;)

    I have built my own scripting language using a handwritten lexer and parser, and it parses a page of text in a few milliseconds.

    It's not as powerful when it comes to error reporting (it basically just returns the first error), but for my purposes it's enough.

    I also know that IDEs usually use some extra tweaks to avoid reparsing all the text; they just reparse the current statement or line, depending on the language.

    They can also often highlight from the lexer alone, which is much simpler and works on each token.
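    To illustrate highlighting from the lexer alone, a small Python sketch that tags each token with a category an editor could map to a colour. The token categories and the tiny keyword list are assumptions made up for the example, not taken from any real editor.

```python
import re

# Order matters: earlier patterns win, so keywords are tried before identifiers.
TOKEN_SPEC = [
    ("comment", r"#[^\n]*"),
    ("string", r'"[^"\n]*"'),
    ("number", r"\d+(?:\.\d+)?"),
    ("keyword", r"\b(?:if|else|while|return)\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[-+*/=<>!(){}]"),
    ("space", r"\s+"),
    ("unknown", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def highlight(source):
    """Return (category, text) pairs, one per token, skipping whitespace."""
    return [
        (match.lastgroup, match.group())
        for match in MASTER.finditer(source)
        if match.lastgroup != "space"
    ]
```

    Because each token is classified on its own, a changed line can be re-highlighted without looking at the rest of the file, which fits the "just reparse the current line" tweak.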
  • 4
    A lexer “tokenizes” the input into words and operators (+-. and such).

    The parser looks for patterns like

    An ‘if’ should be followed by a parenthesized expression, then a body.

    An expression ..

    And so on.

    An add expression is a pair of numbers or variables (identifiers) with a + sign in the middle.

    Any whitespace is usually removed by the lexer.
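    The lexer/parser split for the add expression could be sketched like this in Python. The names lex and parse_add and the tuple-shaped tree are assumptions for illustration; a real language would have many more token kinds and rules.

```python
import re

TOKEN = re.compile(r"(\d+)|([A-Za-z_]\w*)|(\+)")
KINDS = ("number", "identifier", "plus")


def lex(text):
    """Turn text into (kind, value) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((KINDS[match.lastindex - 1], match.group()))
        pos = match.end()
    return tokens


def parse_add(tokens):
    """Parse operand ('+' operand)* into nested ("add", left, right) tuples."""

    def operand(i):
        if i < len(tokens) and tokens[i][0] in ("number", "identifier"):
            return tokens[i], i + 1
        raise SyntaxError(f"expected number or identifier at token {i}")

    left, i = operand(0)
    while i < len(tokens):
        if tokens[i][0] != "plus":
            raise SyntaxError(f"expected '+' at token {i}")
        right, i = operand(i + 1)
        left = ("add", left, right)
    return left
```

    The parser never sees whitespace, and a dangling + is reported as a SyntaxError instead of producing a tree.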
  • 0
    @Voxera yes I'm googling it. So I guess taking html/xml as an example: first it just tokenizes everything in a flat way, and then based on the ordering of each token "sorts" the tokens into a parent/child structure?
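    That "flat tokens, then parent/child" idea can be sketched with a stack, here in Python. This assumes strictly well-formed, attribute-free tags like <ul> and </ul>; build_tree and the dict-based node shape are hypothetical, and real HTML needs far more than this.

```python
import re

# One flat pass: each match is a closing tag, an opening tag, or a run of text.
TAG = re.compile(r"<(/?)(\w+)>|([^<]+)")


def build_tree(markup):
    root = {"tag": "#root", "children": []}
    stack = [root]  # stack[-1] is the element currently being filled
    for closing, name, text in TAG.findall(markup):
        if text:
            if text.strip():
                stack[-1]["children"].append(text.strip())
        elif closing:
            if len(stack) == 1 or stack[-1]["tag"] != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()  # finished this element, go back to its parent
        else:
            node = {"tag": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)  # children that follow belong to this node
    if len(stack) > 1:
        raise ValueError(f"unclosed <{stack[-1]['tag']}>")
    return root
```

    Matching closing tags costs one comparison against the top of the stack, so the whole thing is a single pass over the text no matter how deeply the children nest.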
  • 2
    @billgates more or less, but HTML has a few extra problems.

    First, it's not as strictly structured: several elements can be either self-closing or paired with content, like the p tag.

    Also, many pages have broken HTML with tags out of order or missing, so there is a whole set of common practices that HTML parsers use to repair and clean up.

    Also, it's not just HTML but CSS and JavaScript.

    And on top of that there is unstructured text, which presents its own set of problems.

    So I would recommend using something like HtmlAgilityPack or similar for parsing HTML.

    Or use it to repair the document before parsing it with your own parser, so that you can assume it's all valid and well structured.

    I use it and then pull out the tree half way ;) then apply my parser to all attribute values and pure text.

    My old one just ignored any HTML, but I still had to handle unstructured text, and that is best handled early on with a set of good separators around your code. That's why most template libraries use {{ }} or [[ ]].
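    A small Python sketch of why separators like {{ }} help: one regular expression splits the unstructured text from the code before any real parsing happens. split_template is a made-up name and the {{ }} syntax is only the example delimiter from above, not any specific template library's grammar.

```python
import re

# Non-greedy so each {{ ... }} pair is matched separately.
PLACEHOLDER = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)


def split_template(text):
    """Split text into ("text", ...) and ("code", ...) segments, in order."""
    parts, last = [], 0
    for match in PLACEHOLDER.finditer(text):
        if match.start() > last:
            parts.append(("text", text[last:match.start()]))
        parts.append(("code", match.group(1).strip()))
        last = match.end()
    if last < len(text):
        parts.append(("text", text[last:]))
    return parts
```

    Only the "code" segments then need to go through the language's own lexer and parser.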
  • 0
    @Voxera yes i usually use parsers like HtmlAgilityPack, Gson, Json.NET, but sorta had a thought this may be something interesting to explore, learn. Sorta thinking what's separating me and those Google/CS guys... What exactly can they do intuitively that I can't?
  • 0
    Sorta thinking maybe this is the way for me to learn algos/DS. I don't like toy problems, but if I'm actually building something that's usable and needs this stuff then I'll learn it because I need to, rather than just to answer some stupid technical interview question.
  • 2
    @billgates go for it, it’s an interesting set of problems with many layers of improvements possible :)
  • 1
    The first step is to define the grammar of the language (its syntax).
    Then you build a parser; you can build one from scratch yourself or use a generator like GNU Bison.
    Then, once you have your source code parsed and stored, you can write rules on it to search for common errors; that is the basis of a linter.

    If you want the good stuff then there's static analysis, which has a whole lot of theory behind it and lets you find bugs that you cannot describe with a syntactic rule.

    It's a very interesting field in my opinion and is quite important when you're writing critical code, like the code for planes, spacecraft, or very big high-quality projects, although it can be very helpful even on smaller projects. Take a look at Infer from Facebook, for example.
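    As a sketch of "write rules on the parsed code", here are two linter-style rules in Python, assuming the standard ast module does the parsing. The function name lint and the two rules are picked for illustration; they are purely syntactic checks, nothing like the deeper static analysis a tool such as Infer does.

```python
import ast


def lint(source):
    """Return sorted (line, message) warnings for two simple syntactic rules."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        # Rule 1: an except clause with no exception type.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            warnings.append((node.lineno, "bare 'except:' hides errors"))
        # Rule 2: '== None' or '!= None' on the right of a comparison.
        elif isinstance(node, ast.Compare):
            for op, right in zip(node.ops, node.comparators):
                if (
                    isinstance(op, (ast.Eq, ast.NotEq))
                    and isinstance(right, ast.Constant)
                    and right.value is None
                ):
                    warnings.append((node.lineno, "compare to None with 'is'"))
    return sorted(warnings)
```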
  • 0
    @Voxera ++ for actually answering the question :)
  • 1
    You would be served well by a book that covers the subject of programming language translation.
  • 0
    @billgates what’s wrong with it being in C? I was taking this post to be related to comments made earlier by irene and tmpnull in your js thread, and not to be as random as you implied. I realize neovim is not a 💯 IDE straight out of the box, but that’s what i was getting at. Just ignore the vim craze for a second. Syntax highlighting is built in already and there's plenty of documentation on how to cater it to the task at hand. You’ve also got neomake that will asynchronously lint, and if it’s anything like neoV then that’s going to have documentation on how to set up that environment as well.

    You say you don’t like toy projects so i’m assuming you're a bit like me: you come up with something worthwhile to do, then dive headfirst into it, attempting to learn it from the ground up. What better way to start than with C in an environment that can be broken down and learned piecemeal?
  • 0
    @billgates i mentioned my own endeavor on the off chance that you might want to take part. You seem to essentially have the same goal: learn programming from the ground up. You're miles ahead of me I’m sure, but i’m a quick study.

    Not only that, I tend to ask the questions that no one else does, and it often leads to everyone involved getting a broader understanding. My past instructors have either loved or hated me for it because it always pushed them to find out shit that they didn’t know already.

    Left on my own this usually has me infinitely chasing down random tangents, and it’s not uncommon to find me attempting to study three different concepts or languages all at once. It helps to have someone around to ask if they got the same thing that i did from the documentation.

    In any case when i say slackware i mean zenwalk or more specifically the first four packages of zenwalk which is quite minimalist beyond the cli music player. the idea is to set up an os strictly...
  • 0
    @billgates for use as an ide. it’ll have neovim, a basic mindmap/kanban, a vector editor(probably inkscape if it works ok with way/xway), and a simple, lightweight browser to display any code that’s web-related. Succeed in that and the next step is code mirror and realtimeDB like with firebase’s firepad.
  • 0
    @M1sf3t I never the source. code/project structure is hard to go through and I couldn't locate what I was looking for (a parser/linter).

    I tried learning C++ but had sort of the same issue. I had no use for it since whatever I wanted to code, I could already do in other languages I already know.
  • 0
    @billgates you know I like neovim but I'm not necessarily sold on developing it specifically, something inspired by it would do. The default key layout won't be much vim-like at all anyway. Not that there won't be a legacy version, but the key mappings are going to be for those that grew up on Wolfenstein and Doom, i.e. wasd as opposed to hjkl or however the fuck that lines up, I can never keep k and j straight. Tho for me personally it's been so long since I had the money to keep up with computer gaming, I'm going with jikl for movement since that's the hand I'm used to using for arrow keys. But you get the idea.

    Also moving all the fucking movement keys to one hand. Whoever the hell thought w and e would be good keys to move entire words was worried too much about the fucking letters matching. Right now I've got , as end of word and period as beginning of the next (makes sense to me 🤷🏻‍♂️), putting the repeat action on the other hand with the rest of the action commands....
  • 2
    @billgates may I recommend the Compilers book by Aho and Ullman? It answers stuff like this (in addition to @Voxera 's excellent answers).
  • 0
    @billgates it took me so long to figure out just the vimrc though that I got aggravated and decided to dive into a linux project in the hopes of getting the hang of the finer points of the directory and just the command line in general.

    But you're right, the general docs are fucked. The reason I took so long with the init file wasn't because I didn't understand the tutor, they just took too long to get to the key remap and I ventured off looking for how to do it in the help docs. I had to link to 3 pgs, not in order, to get the full instruction. No reference to the back command, not even a link back to the first pg. I ended up saving the fucking tutor as my init the first go round.

    Anyway what I'm getting at is that I meant for you to duck it and see what's available online from the general public. So far I've managed to find what I needed. I'm about to dive into the syntax part tho so we'll see how long it lasts
  • 0
    @RememberMe my attention span has gotten so bad I have to get my computer to read it to me so that I can fidget with a drawing or something while I listen 🤦🏻‍♂️
  • 1
    @M1sf3t heh
    Well, I would recommend doing that then, because there's a lot of knowledge in these books that just isn't available otherwise. Your average internet tutorial or whatever doesn't even come close to the depth and rigour these books reach. They can be pretty hardcore though so expect to budget a fair amount of time.

    Just putting it here in case anyone's interested, other good books on this topic would be An Introduction to Formal Languages and Automata by Linz (the theoretical framework behind grammars etc.), Engineering a Compiler by Cooper & Torczon (modern compiler techniques), and Linkers and Loaders by Levine (old but pretty awesome).
  • 1
    @RememberMe oh i'm not knocking books at all, honestly I miss sitting down with one. I had it in my head when I got back from across the pond to finish school and be a war correspondent, but then I stuck my nose into politics and quickly realized how much of a circus the media in general had become.

    Hard copies, especially of technical documentation, get expensive though, and so more often than not I just crawl the web for free .pdf docs, which is probably part of the problem. I customized a terminal window the other day to give me a more paper-like setting to read on; it seemed to help, copy/pasting them into neoV and keeping them as txt files. Thought I might try to dig out the old nook I had in school and see if it worked.

    I got the last book you recommended me bookmarked on here somewhere. I tried to search iBooks for it... don't know if you saw my post about siri recommending the Holy Bible instead 😂