Do all the things like ++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatarSign Up
Get a devDuck
Rubber duck debugging has never been so cute! Get your favorite coding language devDuckBuy Now
Search - "string parsing"
Started being a Teaching Assistant for Intro to Programming at the uni I study at a while ago and, although it's not entirely my piece of cake, here are some "highlights":
* students were asked to use functions, so someone was ingenious (laughed my ass off for this one):
* "you need to use functions" part 2
*moves the whole code from main to a function*
* for Math-related coding assignments, someone was always reading the input as a string and parsing it, instead of reading it as numbers, and was incredibly surprised that he can do the latter "I always thought you can't read numbers! Technology has gone so far!"
* for an assignment requiring a class with 3 private variables, someone actually declared each variable needed as a vector and was handling all these 3 vectors as 3D matrices
* because the lecturer specified that the length of the program does not matter, as long as it does its job and is well-written, someone wrote a 100-lines program on one single line
* someone was spamming me with emails to tell me that the grade I gave them was unfair (on the reason that it was directly crashing when run), because it was running on their machine (they included pictures), but was not running on mine, because "my Python version was expired". They sent at least 20 emails in less than 2h
* "But if it works, why do I still have to make it look better and more understandable?"
* "can't we assume the input is always going to be correct? Who'd want to type in garbage?"
* *writes 10 if-statements that could be basically replaced by one for-loop*
"okay, here, you can use a for-loop"
*writes the for loop, includes all the if-statements from before, one for each of the 10 values the for-loop variable gets*
* this picture
N.B.: depending on how many others I remember, I may include them in the comments afterwards19
Waisting some times on codewars.com
3 kyu challenge:
Given a string with mathematical operations like this: ‘3+5*7*(10-45)’, compute the result
*Does a quick and easy one liner in python using eval()*
*sees people actually writing some 100 lines parsing the string and calculating using priority of operation*
(Btw, passed to lvl 4 kyu thx to this)14
Recently saw a piece of code left by a former coworker where he converted an int to a long, by converting it to a string, then parsing the string.5
Super excited to get my first ML project on, been working on this on - off for a couple of years. Now that I had sometime to spare could get it completed :) And along the way learned a lot of things as well :)
Let me know what you guys think.
A lot of thanks to this Repo that helped in parsing the image to string.
Of course PR's and suggestions really welcome and appreciated.9
I am much too tired to go into details, probably because I left the office at 11:15pm, but I finally finished a feature. It doesn't even sound like a particularly large or complicated feature. It sounds like a simple, 1-2 day feature until you look at it closely.
It took me an entire fucking week. and all the while I was coaching a junior dev who had just picked up Rails and was building something very similar.
It's the model, controller, and UI for creating a parent object along with 0-n child objects, with default children suggestions, a fancy ui including the ability to dynamically add/remove children via buttons. and have the entire happy family save nicely and atomically on the backend. Plus a detailed-but-simple listing for non-technicals including some absolutely nontrivial css acrobatics.
After getting about 90% of everything built and working and beautiful, I learned that Rails does quite a bit of this for you, through `accepts_nested_params_for :collection`. But that requires very specific form input namespacing, and building that out correctly is flipping difficult. It's not like I could find good examples anywhere, either. I looked for hours. I finally found a rails tutorial vide linked from a comment on a SO answer from five years ago, and mashed its oversimplified and dated examples with the newer documentation, and worked around the issues that of course arose from that disasterous paring.
So I ended up storing the markup (rendered from a rails partial) in an html comment of all things, and pulling the markup out of the comment and gsubbing its IDs on document load. This has the annoying effect of preventing me from using html comments in that partial (not that i really use them anyway, but.)
Every step of the way on building this was another mountain climb.
* singular vs plural naming and routing, and named routes. and dealing with issues arising from existing incorrect pluralization.
* reverse polymorphic relation (child -> x parent)
* The testing suite is incompatible with the new rails6. There is no fix. None. I checked. Nope. Not happening.
* Rails6 randomly and constantly crashes and/or caches random things (including arbitrary code changes) in development mode (and only development mode) when working with multiple databases.
* nested form builders
* styling a fucking checkbox
* Making that checkbox (rather, its label and container div) into a sexy animated slider
* passing data and locals to and between partials
* misleading documentation
* building the partials to be self-contained and reusable
* coercing form builders into namespacing nested html inputs the way Rails expects
* input namespacing redux, now with nested form builders too!
* Figuring out how to generate markup for an empty child when I'm no longer rendering the children myself
* Figuring out where the fuck to put the blank child template markup so it's accessible, has the right namespacing, and is not submitted with everything else
* Figuring out how the fuck to read an html comment with JS
* nested strong params
* nested strong params
* nested fucking strong params
* caching parsed children's data on parent when the whole thing is bloody atomic.
* Converting datetimes from/to milliseconds on save/load
* CSS and bootstrap collisions
* CSS and bootstrap stupidity
* Reinventing the entire multi-child / nested params / atomic creating/updating/deleting feature on my own before discovering Rails can do that for you.
I am so glad it's working.
I don't even feel relieved. I just feel exhausted.
But it's done.
and it's done well. It's all self-contained and reusable, it's easy to read, has separate styling and reusable partials, etc. It's a two line copy/paste drop-in for any other model that needs it. Two lines and it just works, and even tells you if you screwed up.
I'm incredibly proud of everything that went into this.
But mostly I'm just incredibly tired.
Time for some well-deserved sleep.8
A dev team has been spending the past couple of weeks working on a 'generic rule engine' to validate a marketing process. The “Buy 5, get 10% off” kind of promotions.
The UI has all the great bits, drop-downs, various data lookups, etc etc..
What the dev is storing the database is the actual string representation FieldA=“Buy 5, get 10% off” that is “built” from the UI.
Might be OK, but now they want to apply that string to an actual order. Extract ‘5’, the word ‘Buy’ to apply to the purchase quantity rule, ‘10%’ and the word ‘off’ to subtract from the total.
Dev asked me:
Dev: “How can I use reflection to parse the string and determine what are integers, decimals, and percents?”
Me: “That sounds complicated. Why would you do that?”
Dev: “It’s only a string. Parsing it was easy. First we need to know how to extract numbers and be able to compare them.”
Me: “I’ve seen the data structures, wouldn’t it be easier to serialize the objects to JSON and store the string in the database? When you deserialize, you won’t have to parse or do any kind of reflection. You should try to keep the rule behavior as simple as possible. Developing your own tokenizer that relies on reflection and hoping the UI doesn’t change isn’t going to be reliable.”
Dev: “Tokens!...yea…tokens…that’s what we want. I’ll come up with a tokenizing algorithm that can utilize recursion and reflection to extract all the comparable data structures.”
Me: “Wow…uh…no, don’t do that. The UI already has to map the data, just make it easy on yourself and serialize that object. It’s like one line of code to serialize and deserialize.”
Dev: “I don’t know…sounds like magic. Using tokens seems like the more straightforward O-O approach. Thanks anyway.”
I probably getting too old to keep up with these kids, I have no idea what the frack he was talking about. Not sure if they are too smart or I’m too stupid/lazy. Either way, I keeping my name as far away from that project as possible.4
I am currently in a bit of a (well-deserved) lull at work, both of my projects are finishing up/ finished, so tomorrow should be pretty light, as the latter half of today was.
And I have really gotten interested in the HTTP protocol. It's so interesting learning how it all works under the hood.
So I think I'm going to be researching/ messing around with creating a cpp project that essentially implements cURL from the ground up, creating sockets, reading from them, parsing the HTTP requests... all that. I don't expect to actually get it done, but it should be an immense learning experience. I have a clear goal: implement this function:
std::string get(const std::string&);
Once I'm able to just GET as simple as that, I know I have achieved my goal!3
Writing a function to take a string of delimited entities, parse each character to find the separators, capture the characters in between separators, and return an array of entities.
I used this for about a year before I learned about String.split()
I have to write an xml configuration parser for an in-house data acquisition system that I've been tasked with developing.
I hate doing string parsing in C++... Blegh!18
Im happy solving my foobar challenges and then i run verify and i get error. Hmmmm OK it might be parsing the string to int. Then i read that the number can be 10^50. WTF GOOGLE ???? ARE YOU FUCKING CRAZY ? HOW AM I SUPPOSED TO WORK WITH THIS SHIT IN JAVA ?10
I had a four month (paid!) internship this summer, being handed compete control over a custom built, naturally-grown e-commerce site (oh the stories). When I was working on rewriting the admin panel page that showed a list of orders placed, I hit PHP's date parsing... "behavior".
Despite passing a formatting string to `strtotime()` so the date from the database (MySQL VARCHAR field containing a value from `time()`, which was generated when the order was placed and doubled as the order number) would be formatted correctly, I discovered PHP basically ignores it and uses the date delimiter to determine if the date should be formatted in American or European format.
Basically, if the date uses dots or dashes for the delimiter, it is displayed in European format, DD-MM-YYYY. If the delimiter is a forward slash, the output is American, MM/DD/YYYY.
Although this is strictly an American website, this might have been fine if I did not want the date to be formatted as MM-DD-YYYY. To resolve this, I wrote the following function, including the comment, and put it in my common utils module for use where I needed it.
In almost 4 years of programming, this is probably the one thing I am most ashamed of writing.12
I like my log messages to indicate automatically where in the code something happened, so that I can easily identify where a message originated from while tracking down problems.
In C/C++ this is nice and easy - write a logging routine, wrap it in macros for the different log levels and have that automatically output __FILE__, __LINE__ etc.
I wanted to do something similar in NodeJS, as I'd found myself manually writing the file name in the log message and then splitting functionality out into new files and it became a mess.
The only way I found to be able to do this was to create an "Error" object and access the "stack" member of it. This is a string containing a stack backtrace, suitable for writing to console/file. I just wanted the filename/line/routine.
So I ended up splitting the string into lines, then for each of the lines, trimming the surrounding spaces (or tabs?), and parsing them to see if the stack entry is inside my logger module. The first entry outside of that module must therefore be the thing that called it, so I then parse out the routine or object and method, filename and line number.
It's a lot of clumsy work but the output is pretty neat. I just wish it were simpler!2
I wasn't hired to do a dev's job (handled sales) but they asked me to help the non-HQ end with sorting transaction records (a country's worth) for an audit.
Asked HQ if they could send the data they took so I wouldn't need to request the data. We get told sure, you can have it. Waits for a month. Nothing. Apparently, they've forgotten.
Asks for data again. They churn it out in 24 hours. Badly Parsed. Apparently they just put a mask of a UI and stored all fields as one entire string (with no separators). The horror!
Ended up wasting most of a week simply fixing the parsing by brute force since we had no time.
Good news(?): We ended up training the front desk people to ending their fields with semi-colons to force backend into a possibly parsed state.
I had an interesting mystery the other day. I work in the UK, but I'm working remotely from the US for a while. First day, I made some changes, ran the tests and they failed. Weird part was the failing test was for a component I hadn't touched. I took a closer look, and realized it was a date off by several hours. The test was checking that a passed in date appears in the output. But it was creating the date by parsing a string. The library I was using defaults to local time, but the component uses UTC. So, I had inadvertently created a unit test that only passes when run from UTC. But I had never noticed before because my work is in that timezone. Yikes!
Spend several months designing an alternative approach to string matching/parsing not based on parser combination or Regex. Benchmark and profile the ever living hell out of it. Find out the performance is highly competitive with FParsec and far easier on memory than either approach. Discover FParsec responds very badly when failing. Document all of this, do a presentation on the design. Upload video of presentation with links to it on a few sites. Presentation gets downvoted, and receive hatemail. Yay.18
I use the ICU format often for translation because it's simple enough and supported on many platforms. It's something of a standard so I can use the same translation string format and similar library functions everywhere.
ICU is like a really simple templating language, somewhere between printf and something like smarty or twig simplified and specifically intended for internationalisation.
I updated a library providing ICU compatible parsing and formatting for one of the platforms I'm using and find tests break. I assume that only thing to change is the API. ICU very rarely changes and if it did it would be unexpected for it to break the syntax in a major way without big news of a new syntax.
The main contributor of the library has changed since some time last year. Someone else picked up the project from previous contributors.
Though the library is heavily advertised as using ICU it has now switched to using a custom extended format that's not fully compatible and that is being driven by use case demand rather than standardisation.
Seems like a nice chap but has also decided for a major paradigm shift for the library.
The ICU format only parses ICU templates for string substitution and formatting. The new format tries to parse anything that looks XML like as well but with much more strict rules only supporting a tiny subset of XML and failing to preserve what would otherwise be string literals.
Has anyone else seen this happen after the handover of an opensource library where the paradigm shifts?3