Search - "parsers"
-
I always wanted to have comments in JSON files, but I just discovered a really good alternative:
{
"//": "This is a comment",
"somekey": "somevalue"
}
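A minimal sketch of how a consumer could handle that "//" key, assuming it wants the comment dropped after parsing. The helper name `strip_comment_keys` is made up for illustration; only the standard `json` module is used.

```python
import json

# Hypothetical helper: recursively drop keys that are used as comments.
def strip_comment_keys(value, comment_key="//"):
    if isinstance(value, dict):
        return {
            key: strip_comment_keys(item, comment_key)
            for key, item in value.items()
            if key != comment_key
        }
    if isinstance(value, list):
        return [strip_comment_keys(item, comment_key) for item in value]
    return value

raw = '{"//": "This is a comment", "somekey": "somevalue"}'
config = strip_comment_keys(json.loads(raw))
print(config)
```

One caveat: repeating the "//" key inside the same object is risky, because Python's `json` keeps only the last duplicate and other parsers may behave differently.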
Looks kinda ugly, but it works! No need for special unconventional parsers and shit.
-
I have to write an XML configuration parser for an in-house data acquisition system that I've been tasked with developing.
I hate doing string parsing in C++... Blegh!
-
So for those of you keeping track, I've become a bit of a data munger of late, something that is both interesting and somewhat frustrating.
I work with a variety of enterprise data sources. Those of you who have done enterprise work will know what I mean. Forget lovely Web APIs with proper authentication and JSON fed by well-known open source libraries. No, I've got the output from an AS/400 to deal with (for the youngsters amongst you, AS/400 is a 1980s IBM mainframe-ish operating system that originally ran on 48-bit computers). I've got EDIFACT to deal with (for the youngsters amongst you: EDIFACT is the 1980s precursor to XML. It's all cryptic codes, + delimited fields and ' delimited lines) and I've got legacy databases to massage into newer formats, all for what is laughably called my "data warehouse".
But of course, the one system that actually gives me serious problems is the most modern one. It's web-based, on internal servers. It's got all the late-noughties buzzwords in web development, such as AJAX and jQuery. And it now has a "Web Service" interface, at the request of the bosses, that I have to use.
The programmers of this system have based it on that very well-known database: InterSystems Caché. This is an Object Database, and doesn't have an SQL driver by default, so I'm basically required to use this "Web Service".
Let's put aside the poor security. I basically pass a hard-coded human readable string as password in a password field in the GET parameters. This is a step up from no security, to be fair, though not much.
It's the fact that the thing lies. All the files it spits out start with that fateful string: '<?xml version="1.0" encoding="ISO-8859-1"?>' and it lies.
It's all UTF-8, which has made some of my parsers choke, when they're expecting latin-1.
But no, the real lie is the fact that IT IS NOT WELL-FORMED XML. Let alone Valid.
THERE IS NO ROOT ELEMENT!
So now, I have to waste my time writing a proxy for this "web service" that rewrites the XML encoding string on these files, and adds a root element, just so I can spit it at an XML parser. This means added infrastructure for my data munging, and more potential bugs introduced or points of failure.
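The repair step such a proxy performs could look roughly like this. It is only a sketch under assumptions: the function name `repair`, the wrapper tag `root` and the sample payload are invented for illustration, and the real service's output may need more care (for example, if it emits several declarations).

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0" encoding="ISO-8859-1"?>
DECL_RE = re.compile(rb"^\s*<\?xml[^>]*\?>")

def repair(payload: bytes, root_tag: str = "root") -> ET.Element:
    """Replace the lying declaration and wrap the fragments in one root element."""
    body = DECL_RE.sub(b"", payload, count=1)
    wrapped = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        + f"<{root_tag}>".encode("ascii")
        + body
        + f"</{root_tag}>".encode("ascii")
    )
    return ET.fromstring(wrapped)

# Invented sample: claims ISO-8859-1, is really UTF-8, and has no root element.
sample = '<?xml version="1.0" encoding="ISO-8859-1"?><item>café</item><item>two</item>'.encode("utf-8")
root = repair(sample)
print([child.text for child in root])
```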
Let's just say that the developers of this system don't really cope with people wanting to integrate with them. It's amazing that they manage to integrate with third parties at all...
-
Made parsers that are downloading and parsing 100 years of law data right now.
I am so happy I don't have to read every document and can just let the computer do it for me.
These are great times we live in.
-
god damn it c++, you and your ambiguous, contextual grammar!
currently working on a c++ and c parser, went from trying to use a parser generator to now writing a parser by hand.
-
Started writing a parser for moonscript. Because I want to do my own syntax highlighting and error support.
I'm sorry, but was this supposed to be difficult? Every article I read claimed this was gonna be some impossible feat of herculean effort. I half dreaded it, the other half was kinda elated.
Only it didn't live up to the hype. The tokenizer is a glorified character stream. The lexer is little more than a tokenizer, and the "most complicated" bit is nothing but a fancy transformation of the token output into a tree.
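To make that pipeline concrete, here is a toy version for arithmetic expressions rather than moonscript. Everything in it (the token shapes, `tokenize`, `parse`, the tuple-based tree) is an illustrative assumption, not the parser described above.

```python
import re

# One token per match: either a run of digits or any single other character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def tokenize(text):
    tokens = []
    for number, other in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif other.strip():  # skip stray whitespace
            tokens.append(("op", other))
    return tokens

def parse(tokens):
    """Recursive descent: turn the flat token list into a nested tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, value = advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def term():
        node = factor()
        while peek() in (("op", "*"), ("op", "/")):
            _, op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree

tree = parse(tokenize("1 + 2 * 3"))
print(tree)
```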
I'm completely new to parsers proper and semantic checking, and maybe that's why it seemed easy, but I don't see what all the forewarnings in tutorials were ever about.
-
Depends on the project.
If it is a full application I usually start with what information it will handle.
Then either sketch out some database or some pages depending on how much info I got and if I got any good examples.
The less info I have, the more I try to focus on use cases and workflow to try to figure out what data will be needed.
But for more niche projects, like a supporting library, e.g. a parser, I either mock up some tests in LINQPad or look for similar examples online to flesh out the idea.
But I tend to very quickly fill out the basic shape and try to get something that can be tested.
Then I can find out if I need to rethink it.
-
evil === true
Found this one after 4 hours of debugging... Want to screw with other teams? Shove some UTF-8 BOM characters into JSON responses consumed by Node (and other frameworks as well). Watch as they scramble to find why JSON.parse() fails on seemingly nothing.
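The rant is about Node, but Python's `json` module rejects a string with a leading BOM too, so the failure and a tolerant wrapper can be sketched there. `loads_tolerant` is a hypothetical helper name and the payload is invented.

```python
import json

BOM = "\ufeff"  # U+FEFF, invisible in most editors and log viewers
payload = BOM + '{"evil": true}'

try:
    json.loads(payload)
except json.JSONDecodeError as exc:
    print("parse failed:", exc)

def loads_tolerant(text):
    """Strip a single leading BOM, if present, before parsing."""
    if text.startswith(BOM):
        text = text[1:]
    return json.loads(text)

print(loads_tolerant(payload))
```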
Background: BOM markers are hidden characters that indicate text stream information to applications. Many JSON parsers don't ignore them and instead throw exceptions that don't appear to make sense.
-
wow, I have to say, tail recursion, slices, iterator transformers and shadowing are great building blocks for parsers. I'm actually way faster at defining and modifying procedural parsers with Rust's tools than I am with Chumsky. I actually understand which of my checks are greedy and lazy now!
-
1. Teach DS and Algos. Not basics but advanced data structures and the ones that are recently published.
2. DBMS should show core underlying concepts of how queries are executed. Also, what data structures are used in new tech.
3. Teach linkers, compilers and things like JIT. Parsers and how languages have implemented X features.
4. Focus on concepts instead of languages. My school has a grad course for R and Java. (I can get that from YouTube!!)
5. Focus a little on software engineering design patterns.
6. It's a crime to let a developer graduate if he doesn't know Git or any version control. Plus, give extra credit to students contributing to open source. Tell them that if they submit a PR, they get good grades. If that PR gets merged, a bonus (straight A, maybe?)
7. Teach some design patterns and how industry writes code. I am giving a talk at school to explain the SOLID design principles.
Mostly make them build software!
Make them write code!
Make them automate their homework!
Make them educated and employable students!
-
Took the dive and started learning kubernetes for the last 90 minutes or so. All I can say at this time... is... fuckin' hell m8!
It's some pretty damn cool tech and deconstructing the pieces to understand how to properly build on top of it has been interesting; to say the least.
but shit, man...
the amount of abstractions happening on top of docker/containerd are just asking for tons of problems hahaha. The last place I worked, we had a fair share of devs that either could not or would not bother with trying to understand docker and would constantly push code to the environments, shit would break, and then they'd come to my team and ask us to basically be human log parsers for them... how in the hell my last company is going to fare with trying to roll out kube is beyond me.
tl;dr - kubernetes has a buttload of moving targets and abstracts a metric-fuck-ton of stuff. Last company I worked for is gonna strugglepuff trying to use it.
-
!rant
finally after months and months of just planning and doing boring stuff a piece of code that was really just fun to code and plan for some days:
i just wrote my first "real" parser for a simple DSL. so much fun! i just really can recommend that to everybody.
i've used a parser combinator. the concept of a parser combinator is to combine simple parsers (like: when it starts with a digit or a "-" and continues with digits, then it's an integer, etc.) into a big one. i wrote it in c# and used "Sprache" first, and after some time i switched to "Superpower". a really great lib, but it lacks a bit of documentation. anyway, if you're interested in these things and want to learn how your "daily code" gets parsed, i would recommend that to you! :)
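the integer example can be sketched in a few lines. note this is a hand-rolled toy in Python, not the Sprache or Superpower API; every combinator name here (`char`, `optional`, `many1`, `seq`, `mapped`) is an assumption made for illustration. each parser takes `(text, pos)` and returns `(value, new_pos)` or `None` on failure.

```python
def char(predicate):
    """Parser for a single character accepted by `predicate`."""
    def parser(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parser

def optional(p, default=""):
    """Try `p`; on failure succeed with `default` without consuming input."""
    def parser(text, pos):
        result = p(text, pos)
        return result if result is not None else (default, pos)
    return parser

def many1(p):
    """Apply `p` one or more times and join the results into a string."""
    def parser(text, pos):
        items = []
        result = p(text, pos)
        while result is not None:
            value, pos = result
            items.append(value)
            result = p(text, pos)
        return ("".join(items), pos) if items else None
    return parser

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser

def mapped(p, fn):
    """Transform the value produced by `p` with `fn`."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parser

# "starts with a digit or a '-' and continues with digits" => integer
minus = char(lambda c: c == "-")
digits = many1(char(lambda c: c in "0123456789"))
integer = mapped(seq(optional(minus), digits), lambda parts: int("".join(parts)))

print(integer("-42 rest", 0))
print(integer("abc", 0))
```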
greetings to all fellow devRanters and happy coding / parsing! :)
-
It's sort of two separate projects although they are very tightly related.
The first is a pattern combination library and parsing engine. It takes a superficially similar approach to Regex or parser combinators, but with some important underlying differences.
The second is a specialized (not Turing-complete) language for rapidly defining full language grammars and parsers/lexers for those languages.
-
Are there any good HTML parsers for C out there? I'm trying to make a terminal version of devRant.
-
The BeautifulSoup (Python module) docs are a single block of text that scrolls forever and is hard to read. The examples are OK, but come on, we're devs, not text parsers. We need clear, clean and visual documentation. I don't like the organization of the Facebook API docs either. It was a nightmare to build my first simple app. There are tons of these messy, almost unreadable and confusing docs. It's strange, but usually these kinds of docs are related to open source projects. Long live Markdown and GitHub.
-
Ah, the elusive 31st of June - the client's favourite date. Also the DateTime parser's least favourite date.
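For what it's worth, a strict parser rejects it outright. A small sketch using Python's standard `datetime`; the day/month/year format string and the helper name `parse_date` are assumptions.

```python
from datetime import date, datetime

def parse_date(text):
    """Return a date for valid day/month/year input, or None otherwise."""
    try:
        return datetime.strptime(text, "%d/%m/%Y").date()
    except ValueError:
        return None

print(parse_date("30/06/2024"))
print(parse_date("31/06/2024"))
```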
-
Does anyone have any recommendations for command line parsers for Python? I've looked at argparse, click and docopt so far. I am clearly bad at making informed decisions.
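As a baseline for comparing the three, this is roughly what a small interface looks like with the standard library's argparse. The program name and the options are invented for illustration.

```python
import argparse

def build_parser():
    # Hypothetical CLI: one positional argument plus two options.
    parser = argparse.ArgumentParser(prog="rant", description="Search rants")
    parser.add_argument("query", help="search term")
    parser.add_argument("-n", "--limit", type=int, default=10, help="max results")
    parser.add_argument("-v", "--verbose", action="store_true", help="chatty output")
    return parser

# Passing a list instead of reading sys.argv keeps the example self-contained.
args = build_parser().parse_args(["parsers", "--limit", "5"])
print(args.query, args.limit, args.verbose)
```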
-
I have to reimplement a couple of complicated OOXML parsers (docx, pptx, vsdx, etc). Actually, I implemented them in Python ~5 years ago, but now I have to improve them and add support for nested/embedded formats and some other stuff. As you could expect, none of the OOXML validators are valid themselves, so it's better to have an MS product installed locally, just to be reassured that everything works fine and the parser produces the format that's recognizable by M$.
So I’ve bought a key on eBay (yep, I’m not paying full price for this shit: release valid validators first, bitches; don't make me buy things I don't need). The key is valid, everything is fine. But no, you just cannot have a link to download this fucking installer, no-no-no-no. We won't give you a link until after you enter a key. FEEL DEPENDENT. OBEY.
But I digress. Here's their MANUAL about DOWNLOADING the INSTALLER:
https://support.office.com/en-us/...
So, what's wrong with it? Oh, just a minor misunderstanding. They always give you a link to download an exe-installer. Even if you use Safari.
Why is everything so fucked?