Details
-
AboutLapsed coder turned data munger and manager in a Logistics Comony in Spain
-
SkillsPython, SQL, Power Query/DAX, MS Power Platform
-
LocationBarcelona, Spain
-
Github
Joined devRant on 1/8/2019
Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
-
Security is a joke. And people don't seem to get it. Especially Data mungers.
I've spent about half an hour trying to work out how to securely connect to power BI using PowerShell in a renewable manner for unattended access later on.
Every single example I've found seems to involve you storing $user and $password variables inside your script. If I'm lucky, they're going to pass them through ConvertTo-SecureString. And nobody talks about securely storing AD auth tokens, or using the Windows Credential Manager.
I know it's possible, but it's going to take me ages to work out how from all sorts of disparate sources...16 -
So for those of you keeping track, I've become a bit of a data munger of late, something that is both interesting and somewhat frustrating.
I work with a variety of enterprise data sources. Those of you who have done enterprise work will know what I mean. Forget lovely Web APIs with proper authentication and JSON fed by well-known open source libraries. No, I've got the output from an AS/400 to deal with (For the youngsters amongst you, AS/400 is a 1980s IBM mainframe-ish operating system that oriiganlly ran on 48-bit computers). I've got EDIFACT to deal with (for the youngsters amongst you: EDIFACT is the 1980s precursor to XML. It's all cryptic codes, + delimited fields and ' delimited lines) and I've got legacy databases to massage into newer formats, all for what is laughably called my "data warehouse".
But of course, the one system that actually gives me serious problems is the most modern one. It's web-based, on internal servers. It's got all the late-naughties buzzowrds in web development, such as AJAX and JQuery. And it now has a "Web Service" interface at the request of the bosses, that I have to use.
The programmers of this system have based it on that very well-known database: Intersystems Caché. This is an Object Database, and doesn't have an SQL driver by default, so I'm basically required to use this "Web Service".
Let's put aside the poor security. I basically pass a hard-coded human readable string as password in a password field in the GET parameters. This is a step up from no security, to be fair, though not much.
It's the fact that the thing lies. All the files it spits out start with that fateful string: '<?xml version="1.0" encoding="ISO-8859-1"?>' and it lies.
It's all UTF-8, which has made some of my parsers choke, when they're expecting latin-1.
But no, the real lie is the fact that IT IS NOT WELL-FORMED XML. Let alone Valid.
THERE IS NO ROOT ELEMENT!
So now, I have to waste my time writing a proxy for this "web service" that rewrites the XML encoding string on these files, and adds a root element, just so I can spit it at an XML parser. This means added infrastructure for my data munging, and more potential bugs introduced or points of failure.
Let's just say that the developers of this system don't really cope with people wanting to integrate with them. It's amazing that they manage to integrate with third parties at all...2 -
So, the PowerQuery type system appears to be a Joke.
For those you that aren't familiar with PowerQuery, it's the ETL language that is used in PowerBI, and some other parts of the MS PowerPlatform. It was formerly known as the M Language.
The language has a type system, that includes records (think hashes) and tables, which are, for practical purposes, a list of records.
The wonderful M language specification document states that:
"Any value that is a record conforms to the intrinsic type record, which does not place any restrictions on the field names or values within a record value. A record-type value is used to restrict the set of valid names as well as the types of values that are permitted to be associated with those names."
Except that the restriction is only to the set of valid names, and the language interpreter doesn't throw an error when I place a number into a text field, but also doesn't do any sort of implicit conversion. This is all hunky-dory, until you then try to load the data into the Tabular Model that underlies the query engine, which does expect the values to be of the type that is specified, and it throws an error.
But PowerBI, in its infinite wisdom, doesn't actually *record* the error, it merely tells you the error exists, and tells you to go back to the query editor to list the errors thrown up by the powerquery engine. Which, as previously stated, doesn't throw up an error for this instance.
So I've spent all afternoon trying to work out why my queries aren't loading, because I have an error that doesn't exist. fml.
[You can follow this issue on the communtiy feedback site here: https://community.powerbi.com/t5/... ] -
Power BI: wonderful tool, pretty graphics, and can do a lot of powerful stuff.
But it’s also quite frustrating when you want to do advanced things, as it’s such a closed platform.
* No way to run powerquery scripts in a command line
* Unit testing is a major pain, and doesn’t really test all the data munging capabilities
* The various layers (offline/online, visualisation, DAX, Powerquery, Dataset, Dataflow) are a bit too seamless: locating where an issue is happening when debugging can be pain, especially as filtering works differently in Query Editing mode than Query Visualisation mode.
And my number 1 pet peeve:
* No version control
It’s seriously disconcerting to go back to a no version control system, especially as you need to modify “live code” sometimes in order to debug a visual.
At best, I’ve been looking into extracting the code from the file, and then checking that into git, but it’s still a one-way street that means a lot of copying and pasting back into the program in order to roll back, and makes forking quite difficult.
It’s rewarding to work with the system, but these frustrations can really get to me sometimes2