8

!rant...

...i am actually scared about posting this one... because... well, i've mentioned that language idea that i've been mucking around with "designing"... and... i have grand ideas, but no idea if i understand stuff and dev needs and stuff well enough to be doing what i'm doing right now in trying to put it into lang design....

...and posting it here is throwing myself into the lion's den with almost nothing, and risking shame when someone who knows this stuff looks at it and laughs at me, realizing that it's utter bullshit that has no idea what it's doing, a perfect dunning-kruger example...

...and this fear is reinforced by the fact that the whole thing is still (about 5 years after i've been mucking around with it mildly) very much in flux containing lots of things i'm not sure about, undecided about, don't know enough about, don't realize the implications of, etc etc...

but... let's try it.
let's link this thing and let you probably tear me to shreds =D
(ignore the c# project, that's the example of what i was talking about regarding the parser, bullshit that kinda spins out into self-referential circles because although i understand the parser and interpreter theory, I wasn't able to transform any of it into practice yet)
https://github.com/sh-code/AsmOs

Comments
  • 0
    please be at least a bit merciful in your skilled educated cruelty, thank you :)
  • 0
    ...most of it has been written while high or drunk or tired as a way to unwind, or both or all three O:-)
  • 0
    p.s. I don't use github. like, at all. i don't do opensource, in any way, all of what i do is proprietary and all of what i play around with in my free time is on my local HDD, the crap in there is my few immediately abandoned attempts to sync local and remote, or whatever... please don't judge me based on it :'(
  • 0
    @Root
    not sure if and how much this is your area, but if you had a bit of time and will to look at it, I would be very grateful for your feedback.
  • 0
    @Root if you know someone better qualified to look at this (while also possessing the infinite patience of those magical beings who can actually deal with self-educated half-wits reaching too high) feel free to tag them. thank you <3
  • 1
    I haven't taken a very good look at it yet, but one thing stood out to me immediately: if you want the code to run natively you're going to need to write a runtime for it for memory management stuff.

    You mention you understand the concepts but weren't able to implement the AST and whatnot. I suggest you take a look at craftinginterpreters.com, it's great and it does talk about compilers, not only interpreters. It lets you dip your toes into compiler and interpreter implementation and also points you in the right direction if you want to learn more. Spoiler: it teaches you how to build an interpreter from start to finish.

    Disclaimer: I'm by no means knowledgeable in language design or compiler implementation, I just know a few things here and there from trial and error and a bit of reading.
  • 1
    Oh and go easy on yourself, no one's born knowing everything :)
  • 1
    Another piece of advice: writing a spec for your language will help you tremendously, you'll probably find a few problems with your design when writing the spec. It doesn't have to be anything super formal though, a "tour" of the language should already be enough.
  • 0
    @neeno
    "You mention you understand the concepts but weren't able to implement the AST and whatnot."

    you misunderstood. i think i understand enough to be able to fantasize about it on this level, but I am aware i don't understand enough (yet) to be able to chisel it out into actually good design, and implement even on just a level of vm.

    what is AST? please link entrypoint to topic, thank you.
  • 1
    @neeno
    "if you want the code to run natively you're going to need to write a runtime for it for memory management stuff."

    yes, i know (i think).
    it should/would be part of the vm/runtime stuff, would somewhat run on a different idea of memory management (i'm not in the state to even try to explain now) which could be emulated with current paradigm hardware, but also implemented in new hardware and would be (hopefully) at least much more straightforward.

    but... this is one of those areas where i'm hugely out of my depth, just playing around with some vague notions/wants of what i would like it to be from user/API consumer angle, and a vague idea it should also be achievable in implementation...

    ...vague being the often-repeated keyword.
  • 1
    @Midnight-shcode oh, I see. Well I can't help with that because I haven't managed to write one in a clean and organized way yet XD

    An AST is an Abstract Syntax Tree. Put simply, it describes how the expressions and statements in the parsed code relate to each other. https://en.wikipedia.org/wiki/...
    If you look at the picture on the right and the code below it you should already get the gist of it.

    You really should read the book in craftinginterpreters.com, it'll probably help you a great deal.
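For a concrete feel of what an AST is, here is a minimal sketch in Python (not the language from the repo). The node names `Num` and `BinOp` and the `evaluate` helper are made up for illustration; a real parser would build this tree from source text instead of by hand.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    """Leaf node: a numeric literal."""
    value: float


@dataclass
class BinOp:
    """Inner node: an operator applied to two sub-expressions."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]


def evaluate(node: Node) -> float:
    """Walk the tree bottom-up and compute its value."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unknown operator: {node.op}")


# Hand-built tree for the source text `1 + 2 * 3`.
# Precedence is encoded by the tree's shape: `*` sits below `+`.
tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
result = evaluate(tree)
```

The point is that the tree, not the text, carries the structure: an interpreter walks it, a compiler translates it.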
  • 1
    @neeno
    i'm doing mostly top-down process:
    1. this is how i want the thing to be used/written.
    2. ...(magic happens?)
    3. this is what i want (and think is achievable) the thing to be tech-wise, according to my very limited cursory knowledge that it should be possible and why it would be SO COOL.
  • 1
    @neeno
    "Disclaimer: I'm by no means knowledgeable in language design or compiler implementation, I just know a few things here and there from trial and error and a bit of reading."

    same here, but even from those bits, some idea of "my ideal language" starts to emerge, and that repo is basically my attempt to define what it is.
  • 1
    @neeno
    "It doesn't have to be anything super formal though, a "tour" of the language should already be enough."

    since i'm going for the top-down (usage->implementation) direction, i'm trying to use the "examples" as that kind of a tour.

    clarifying even for myself how I want the usage of it to look, because that's my primary concern right now.
  • 1
    @Midnight-shcode

    "since i'm going for the top-down (usage->implementation) direction, i'm trying to use the "examples" as that kind of a tour."

    I get that, but you still should write a little spec for it, it helps you think. When designing my own language I was thinking of what the code should look like but when I wrote the spec I noticed some things that wouldn't work.
  • 1
    @neeno
    "An AST is an Abstract Syntax Tree."

    oh, yeah, that guy.
    yes. that's a critical part of this idea, and (i think) one of the most unusual ones in how it should be handled.

    which is why my numerous attempts so far have ended in embarrassing failures.

    i have the structure and its logic partly in my head but wasn't able to express any part of it correctly in code yet.
  • 1
    @neeno
    "I get that, but you still should write a little spec for it, it helps you think. "

    as soon as i will start to be firmly decided on things, i will put them into spec, yes. but for now, even in the repo, notice first line comment - some stuff is few months old and looks syntactically very different.

    again, that's why I do "examples" now, to figure out "is this how i would want to use and write this concept?", i'm not even properly beyond this stage yet.

    as soon as things start to solidify, i'll do proper specs.
  • 1
    again, that's why I do "examples" now, to figure out "is this how i would want to use and write this concept?", i'm not even properly beyond this stage yet.

    and in a way, this is probably what I most want to hear feedback/thoughts on, because i would like the language not only be the most natural and breezy for me, but also potential other users
  • 0
    so probably "how would I [express this concept]?" questions would help me the most in this stage
  • 1
    cases don't fall through? :( Man, I know no one will miss that but I will.
  • 2
    @c3r38r170 case fallthrough is a huge wart that should have never happened to begin with.
  • 1
    As for the language, looks like another sample of what I would consider to be the consolidation of languages into a lingua franca - you're using a lot of concepts that are common among many recent languages (including the one I'm working on too).

    So by that metric, it's on the right track.

    There are a few syntactic things I'm a little confused about - my personal philosophy of good neo-language design is that it should convey intent without knowing the intricacies of the language, and that includes both syntax and semantics.

    That being said, a few things stand out as being a bit vague:

    - No idea what thr(position => ...) { ...} is - very strange syntax

    - Is `pub <identifier>` a namespace declaration? How do modules work then, in terms of the filesystem hierarchy/discovery/resolution?

    - Are your arrays linked lists or vectors of some sort? Or fixed size/pointers? Representing these things in Javascript are trivial since that's taken care of for you, but translating to (1/n)
  • 1
    @c3r38r170
    yeah, there still exists a category of users, regarding some features, to whom the syntax says: "screw you, write less spaghetti" =D

    that's another thing, syntax (besides numerous variants for omitting optional stuff) is rigid, including whitespaces. you write something->it gets parsed to tokens->it gets composed back to text with rigid newline and space and tab/nesting info.

    it's kind of a part of the "syntactically specific/rigid as c#" thing.

    the only syntactic multiples to write the same semantic thing should be those which add back in the keywords that could have been omitted due to defaults.
    ( [priv] int privIntPropertyInClass )
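The "parse to tokens, then compose back to text" idea above can be sketched roughly like this in Python. Everything here is an assumption for illustration: the token pattern, the layout rules (tab per nesting level, newline after `;`, `{` and `}`) and the sample snippet are made up, not taken from the repo.

```python
import re

# Hypothetical token pattern: identifiers, integers, "=>", or any single symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|=>|\S")


def tokenize(source: str) -> list[str]:
    """Split source into tokens, discarding all original whitespace."""
    return TOKEN.findall(source)


def canonical(source: str) -> str:
    """Re-emit tokens in one fixed layout, whatever the input spacing was."""
    lines: list[str] = []
    current: list[str] = []
    depth = 0

    def flush() -> None:
        # Emit the pending tokens as one line at the current nesting depth.
        if current:
            lines.append("\t" * depth + " ".join(current))
            current.clear()

    for tok in tokenize(source):
        if tok == "}":
            flush()
            depth -= 1
            lines.append("\t" * depth + "}")
        elif tok == "{":
            current.append(tok)
            flush()
            depth += 1
        elif tok == ";":
            # Glue the semicolon onto the previous token, then end the line.
            if current:
                current[-1] += ";"
            else:
                current.append(";")
            flush()
        else:
            current.append(tok)
    flush()
    return "\n".join(lines)


messy = "pub   main{int x=1;\n\n   int y   =2 ;}"
tidy = canonical(messy)
```

Running `canonical` on its own output gives the same text back, which is the property that makes the layout "rigid": there is exactly one spelling per token sequence.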
  • 1
    machine code will definitely force you to make these decisions (re-allocating memory is a potentially expensive operation, so you generally allocate once and never resize if you can help it).

    - You have a comment that says "main namespace/entry point/block" - well, which is it? Those are three very, _very_ different things.

    - If `pub <ident>` is a function declaration, why isn't `main` publicized in your example? As per most system ABIs, it needs to be an exported symbol (well, technically _start does, but you'd have an exception even if this is the case).

    - If the above is true, then how do you annotate parameter types and return types?

    - You have a `.js` in the extension, so `=>` being a form of assignment is very confusing if trying to "mentally parse" this as a language relating to Javascript. (2/n)
  • 1
    @c3r38r170
    case; break; <- same thing. default is you don't want them to fall through, so fuck "break".

    maybe an explicit "do fall through" instruction would be nice for those perverts like you.

    i propose the keyword "buttfuck;"
  • 1
    - Please don't allow optional bracing for conditionals/loops. It's another one of those warts from C that shouldn't be proliferated.

    - Are types first-class values? What are the semantics surrounding passing them to functions, as in `console.readLine(int)`? Do they augment the return type? What happens when I pass a runtime value? Does that mean you have the ability to overload with statically-evaluated expression and runtime expressions in different overloads? That's going to be a bit hellish to implement as a systems language (I should know, I'm doing something similar right now).

    - what is the type of `1 .. 50` by itself? Does it only make sense in the syntax of `[1 .. 50]` or can I also subscript with the range operator (e.g. `some_var[2..4]`)? Can I assign it to something? If not, what does it expand to, strictly? if it expands to the full range of numbers, what if the range exceeds the amount of memory on the host machine? does that mean I can do `some_var[1, 2, 3, 4]`?
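On the memory question, one possible answer (an assumption, not something the repo states) is to make a range a small value of its own instead of expanding it. Python's built-in `range` works this way: it stores only start, stop and step and computes elements on demand. Note that Python's `range` excludes the stop value, so an inclusive `1 .. 50` is written `range(1, 51)` in this sketch.

```python
import sys

# Inclusive `1 .. 50` written as a Python range (stop is exclusive).
r = range(1, 51)

count = len(r)   # computed from start/stop/step, no elements stored
third = r[2]     # indexing works without expanding the range
sub = r[1:4]     # slicing a range yields another range

# A huge range stays tiny in memory because nothing is materialized.
huge = range(1, 10**12)
huge_size = sys.getsizeof(huge)
huge_last = huge[-1]

# Expanding to a real list is an explicit, separate step.
small_list = list(range(1, 5))
```

With that design, subscripting by a range can mean "slice" rather than "index with a list of numbers", and the range only costs memory when someone asks for the expanded list.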
  • 1
    Mandatory reading if you haven't already come across it:

    https://eev.ee/blog/2016/...
  • 1
    And for reference: https://gist.github.com/Qix-/...

    That's what I'm working on compiling right now. Been a 7 year project. You can probably spot some similarities.

    Also look into Zig, as it's another one of those "lingua franca" samples I mentioned before.
  • 0
    @junon
    "- No idea what thr(position => ...) { ...} is - very strange syntax"

    "=>" reads as "goes to" / "binds to". simple equals is "calculate the thing and assign it to the variable as soon as you encounter the line".

    "=>" is "bind this expression to the variable and, on each access of the variable, calculate it at that moment using the currently scoped-in variables".

    "- Is `pub <identifier>` a namespace declaration? How do modules work then, in terms of the filesystem hierarchy/discovery/resolution?"

    the idea is that namespace/class/function/property is the same difference. all of it gives you a name under which is a value. i call it "block". can be named, unnamed, take parameters or not. the semantic distinction seems... arbitrary/useless to me so i'm trying to get rid of it, but i'm most likely still missing some implications of it.
    (for example constructors are a mess)
  • 0
    What is the `thr` bit then?
  • 1
    > arbitrary/useless to me so i'm trying to get rid of it, but i'm most likely still missing some implications of it.

    You're going to create a lot of work for yourself when lowering to target IR doing this.

    Most IR libraries work in terms of blocks, yes, but functions are also a distinct primitive you will ultimately have to work with.

    Also, unless you're working with fully self-contained programs that do not export symbols (which means, no static/shared libraries), you'll have to work with the concept of functions in order to build ELF executables that work with other programs, or even with the host system.

    Also, understanding which scopes certain things like statements, variables, etc. can go in, and what their semantics are, is imperative when working with a language that runs on bare metal. These semantics should be really hammered down and clear.
  • 0
    @junon "- You have a comment that says "main namespace/entry point/block" - well, which is it? Those are three very, _very_ different things."

    the idea is that namespace/class/function/property is the same difference. all of it gives you a name under which is a value. i call it "block". can be named, unnamed, take parameters or not. the semantic distinction seems... arbitrary/useless to me so i'm trying to get rid of it, but i'm most likely still missing some implications of it.
    (for example constructors are a mess)

    imagine the code being written out in real time manually and that resulting in the environment (the program) being built up line by line.

    you can do it live in editor/interpreter (again, the same thing for this paradigm) or you can load a text file which will be parsed in the same way.

    does that answer your question?
  • 0
    @junon
    "- If `pub <ident>` is a function declaration, why isn't `main` publicized in your example? As per most system ABIs, it needs to be an exported symbol (well, technically _start does, but you'd have an exception even if this is the case)."

    good point.
    main block should definitely be
    pub main.

    --

    BLOCK, btw. that's part of the different paradigm:
    block is either code that evaluates to value upon call, or evaluates to value upon parsing.
    again, the "=" (equals, has value) and "=>" "binds an execution of bit of code to generate value upon call" difference comes in.
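The "=" versus "=>" difference described above can be sketched in Python as eager assignment versus a stored expression that is re-run on every read. The `Env` class and its method names are hypothetical, made up for this sketch; nothing here is the repo's actual design.

```python
from typing import Any, Callable


class Env:
    """Toy variable store with eager ("=") and lazy ("=>") bindings."""

    def __init__(self) -> None:
        self._values: dict[str, Any] = {}
        self._thunks: dict[str, Callable[["Env"], Any]] = {}

    def assign(self, name: str, value: Any) -> None:
        """`name = value`: store the already computed value."""
        self._thunks.pop(name, None)
        self._values[name] = value

    def bind(self, name: str, expr: Callable[["Env"], Any]) -> None:
        """`name => expr`: store the expression itself, not its result."""
        self._values.pop(name, None)
        self._thunks[name] = expr

    def get(self, name: str) -> Any:
        """Reading a lazy name re-runs its expression against current state."""
        if name in self._thunks:
            return self._thunks[name](self)
        return self._values[name]


env = Env()
env.assign("a", 1)
env.assign("eager", env.get("a") + 10)        # eager = a + 10
env.bind("lazy", lambda e: e.get("a") + 10)   # lazy => a + 10

env.assign("a", 5)                            # change a afterwards
eager_now = env.get("eager")                  # keeps the value computed earlier
lazy_now = env.get("lazy")                    # recomputed from the new a
```

Here `eager_now` stays at 11 while `lazy_now` becomes 15, which is the behavioural difference the two operators are meant to express.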
  • 0
    @junon
    "- If the above is true, then how do you annotate parameter types and return types?"

    too drunk and high to understand what you're asking, will get back to that later.

    "- You have a `.js` in the extension, so `=>` being a form of assignment is very confusing if trying to "mentally parse" this as a language relating to Javascript. (2/n)"

    as the README.MD in the repo states: the final .js is there just because the default js parsing and highlighting seems the most sensible/unobnoxious for this new language that doesn't have its own proper parser/syntax highlighter yet.
  • 0
    @Midnight-shcode Not really, no. How the program is inputted to the compiler isn't of any relevance to the semantics of the language.

    Ultimately, the backend will have primitives such as blocks and functions. How do your "blocks" map to the backend blocks/functions?
  • 0
    @junon
    "As for the language, looks like another sample of what I would consider to be the consolidation of languages into a lingua franca - you're using a lot of concepts that are common among many recent languages (including the one I'm working on too)."

    FUCKYEAH, FUCKIN SUCCESS AT LEAST ON THIS BIT OF THE FRONT!!! <3

    it's supposed to be the least verbose "pseudocode" combining modern terse syntax and semantic concepts, yes. <3

    js' "object is dictionary is map is namespace is... fuck off, it's just a named block of stuff", but hopefully avoiding all of the JS trapdoors that make it so horribly annoying.
  • 0
    "There are a few syntactic things I'm a little confused about - my personal philosophy of good neo-language design is that it should convey intent without knowing the intricacies of the language, and that includes both syntax and semantics."

    i agree and the fact that you're only confused about "a few" syntactic things, and only by "a little", hints to me that so far i'm not too far off.

    ...of course it has one or two paradigm-specific things of its own. the question is not whether they're there and need to be learned, but whether they make sense and are useful and contributing to the rest of the intended core features.
  • 0
    @junon
    "- Are your arrays linked lists or vectors of some sort? Or fixed size/pointers? Representing these things in Javascript are trivial since that's taken care of for you, but translating to (1/n)"

    no frikkin idea yet. I would love them to behave largely the same way that JS object/dictionary/array/namespace bullshit blobs behave, but with clearer syntax around working with it and defining it...

    implementation?
    again, welcome to the horribly fuzzy "i FEEL there's a way to implement it, which is not a complete shit" area.
  • 0
    @junon
    "- Are your arrays linked lists or vectors of some sort? Or fixed size/pointers? Representing these things in Javascript are trivial since that's taken care of for you, but translating to (1/n)"

    but so far i'm not even solid on how i want to interface with that weird-but-not-weird-but-i-don't-know semantic concept.
  • 0
    @junon
    "- Are your arrays linked lists or vectors of some sort? Or fixed size/pointers? Representing these things in Javascript are trivial since that's taken care of for you, but translating to (1/n)"

    it's like... should be like...
    loading the program is the same as parsing the program and then stepping through the memory linearly and executing the instruction is the same as running the program... so the memory organization and structure is/should be subservient to that(?)

    ...that's one of those high and far level fuzzy ideas that i think should be possible-
  • 0
    @junon
    - Please don't allow optional bracing for conditionals/loops. It's another one of those warts from C that shouldn't be proliferated.

    yeah i've been fighting with this. not allowing it gives the very concrete clear consistency which i want, but allowing it allows to "simplify by omitting all the defaults".
    not a solved thing yet, and i very much understand your view and share it about 50% of the time when working on the syntax
  • 1
    If your arrays are lists then you'll have to accept that your language simply won't be as efficient as other systems languages. ;)
  • 0
    @junon
    "- Are types first-class values? What are the semantics surrounding passing them to functions, as in `console.readLine(int)`?"

    OMG YES that's an incredibly huge and complicated question, back to you when not insanely high and drunk. trailer: there is a bit of (new-paradigm) reasoning behind what i'm trying to do by the syntax you see there.
    hopefully.
  • 0
    @junon
    "- what is the type of `1 .. 50` by itself? Does it only make sense in the syntax of `[1 .. 50]` or can I also subscript with the range operator (e.g. `some_var[2..4]`)?"

    in-place generator? if that name makes any sense?
    parser encounters:
    1 .. 4
    and expands it to
    [1, 2, 3, 4]
    before assigning to whatever was on the left-hand side.

    something like that is the idea, but again, several angles fighting each other here.

    for a while i was thinking about any program in this as a... text where result of each operation replaces the operation and the whole program "reduces down in-place" to the result, but... that's too insane for me to be able to think through (and therefore also too insane for people to use normally).

    this "generator expression" is a tiny bit of an artifact from that phase.
  • 0
    @junon
    or can I also subscript with the range operator (e.g. `some_var[2..4]`)?

    that would resolve into (literal source code of)
    some_var[2, 3, 4] which doesn't make much sense. but where it makes sense is:

    int[] sequence = [1 .. 4];

    where it resolves into
    int[] sequence = [1, 2, 3, 4];

    which is an in-place definition of a 4-item int array, which:
    if it's assigned with "="
    int[] sequence = [1, 2, 3, 4];

    gets resolved during "parse-run", and now the "sequence" contains that array literal, or if assigned with "=>"
    int[] sequence => [1 .. 4];

    links the generator expression to the variable, and when someone attempts to retrieve the value, the generator expression gets evaluated at that time, and returns its result as value of the variable.
  • 0
    @junon
    "What is the `thr` bit then?"

    stupid shortened keyword for even stupider non-shortened keyword:

    through(){
    }

    basically a "forEach" kind of loop

    forEach( item in bla.items){

    }

    through(item => whatever generates/provides list of items to iterate through){
    //item currentItem
    }

    but based on the ideology of "core keywords should be shortcuts so they are fast and noncluttering to read. one can memorize those few abbreviations, let's keep the long descriptive words for actual class/variable names", so

    thr(int index => [0 .. 49]){
    //does 50 iterations on index var being 0 to 49
    }
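In a mainstream language, the `thr` loop above would roughly desugar to a for-each over an inclusive range. A minimal Python sketch, where the `inclusive` helper is a made-up name and the inclusive upper bound is taken from the "0 to 49, 50 iterations" comment:

```python
from typing import Iterator


def inclusive(start: int, end: int) -> Iterator[int]:
    """Yield start, start + 1, ..., end (both ends included), like `[start .. end]`."""
    current = start
    while current <= end:
        yield current
        current += 1


# Rough equivalent of: thr(int index => [0 .. 49]) { ... }
visited = []
for index in inclusive(0, 49):
    visited.append(index)
```

Because `inclusive` is a generator, each iteration pulls the next value on demand, which matches the "=>" reading of the loop header better than building the whole list first.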
  • 0
    @junon
    "Most IR libraries work in terms of blocks, yes, but functions are also a distinct primitive you will ultimately have to work with."

    but i don't see the logic that way.

    "class" is a function that contains "pub <vartype>" declarations. their default values are whatever is set while the class runs for the first time.

    again a bit of maybe stupid paradigm difference attempts, which i only fuzzily think/feel should work and would make sense once a person is used to it.
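The "class is a function that contains pub declarations" idea can be approximated in existing languages with a function that returns a bag of its public names. A hedged Python sketch, where `Point`, `scale` and `doubled` are invented for illustration and `SimpleNamespace` stands in for the language's "block":

```python
from types import SimpleNamespace


def Point(x: int = 0, y: int = 0) -> SimpleNamespace:
    """A "class" written as a plain function: running the body is construction."""
    # Private helper state: local to the call, never exposed.
    scale = 2

    # "pub" members: whatever they hold when the body finishes running
    # becomes the initial state of the returned object.
    pub = SimpleNamespace()
    pub.x = x
    pub.y = y
    pub.doubled = lambda: (pub.x * scale, pub.y * scale)
    return pub


p = Point(1, 2)
p.x = 10
pair = p.doubled()
```

This shows the front-end view only. It does not answer the back-end question raised in the thread, since the function `Point` still exists as a distinct callable underneath.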
  • 0
    @junon
    "How the program is inputted to the compiler isn't of any relevance to the semantics of the language."

    which is why pascal can't call a function before it's defined in the text, right? ;)
  • 1
    I highly recommend writing up a more formal spec of the language. I'm talking full on ebnf and everything.

    It will make it so much easier to find bugs in the grammar and allow others to write parsers and compilers for your language.