We passed a milestone: 250,000 PHPUnit test cases.

If it weren't for a heavily parallelized build pipeline which splits the suite across 20 servers, a full run would take about 7.5 hours.

Not hating on PHP, and without tests it would truly be hell...

But still, fucking hell, we outgrew PHP.

Not having a solid type system just means you either accept more bugs, or write thousands of unit tests to guard all the foundational cracks in the system.
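To make that concrete, here is a minimal sketch in Rust of the kind of guard a type system gives you for free. Everything in it is hypothetical (the `Invoice` struct, the `InvoiceStatus` enum and `amount_due` are illustration names, not code from the actual system). The idea is that "unknown status string" and "discount was null" are not things you can even express here, so there is nothing to write a unit test for.

```rust
/// Hypothetical invoice states. In loosely typed code these are often
/// plain strings, and every consumer needs tests for typos and unknowns.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InvoiceStatus {
    Draft,
    Sent,
    Paid,
}

/// Amounts are in cents. The discount may be absent, and the type says so.
struct Invoice {
    status: InvoiceStatus,
    total_cents: u64,
    discount_cents: Option<u64>,
}

/// The compiler rejects this function if a status variant is missing from
/// the `match`, or if the `Option` is used without handling `None`.
fn amount_due(invoice: &Invoice) -> u64 {
    match invoice.status {
        InvoiceStatus::Draft | InvoiceStatus::Paid => 0,
        InvoiceStatus::Sent => {
            let discount = invoice.discount_cents.unwrap_or(0);
            // A discount larger than the total clamps to zero instead of underflowing.
            invoice.total_cents.saturating_sub(discount)
        }
    }
}
```

You would still test the arithmetic itself; what goes away is the pile of tests that only check for wrong types, missing values and forgotten cases.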

On the bright side, I get a coffee break after every commit 😄

  • 3
    well done!
  • 7
    Holy shit! That’s insane!

    Wouldn’t it make more sense to switch to a better language instead?
  • 3
    You're testing every PHP function plus your own codebase?
  • 4
    @Lensflare Yeah definitely.

    I used to be a Haskell dev, and I do also write Rust these days.

    Both of them require only occasional tests, because the rest is guarded by the type system.
  • 3
    I'd be really interested in more details.

    Is the codebase large or is it just testing every function a thousand times?

    That number of tests would be problematic in all languages :)
  • 2
    @bittersweet the main problem with switching language is catching any odd edge case or special handling.

    But that is true for any rewrite where you cannot reuse existing unit tests.
  • 2
    Yep, you can cut at least 90% of them when using a modern Hindley-Milner-based type system.
  • 2
    Holy cow
  • 1
    7h for each commit? Could you modularize it more?
  • 0
    @WildOrangutan Yeah locally I only run parts of it. In CI/CD it takes about 20 minutes because it's distributed over servers.
  • 1

    Not completely sure how large the codebase is (well over 20M loc), but there are actually many untested parts.

    It's split up into "megaservices", as I like to call them: over a hundred "microservices" which aren't completely decoupled or deployed separately. Some live in the same repository but communicate through events, others are pulled in through git modules, and it all still blobs together into a giant monolith.

    One of the things I'm currently doing on a daily basis is writing small "truly decoupled" services in Rust, which communicate with the monster through Redis, and then surgically removing the deprecated PHP code. Mostly financial services, which benefit the most from increased performance and safety.
  • 3

    Makes sense.

    But that explains it pretty well.

    Surgically removing is definitely a wise choice, beasts of this size are - no matter how, no matter what - always a challenge.

    Meaning that no architecture could hold that in a maintainable manner.

    Though decoupling into smaller microservices at least shifts the burden to maintenance (keeping containers and libraries up to date) and to finding a way to dance the limbo of resource management for the ever-growing pile of (decoupled) microservices xD

    That's the reason I like microservices: the architecture has a lot of downsides, and the hype was, as always, pretty much ignorant of them, but it at least shifts the problems to tasks that can be automated.
  • 1

    Yeah for giant projects there are some advantages, although I'm not that strict about the "micro", it's more about having "a separate service for each domain".

    Our invoicing codebase is still enormous, but by splitting it off the rest of the codebase you have an "internal product" with a clearly defined API of sorts, and developers can maintain it without understanding the rest of the code.

    But yeah, the complexities are often underrated.

    For stuff which takes time, do you poll for completion, or send some event back? How to build for failure handling, retries, outages, etc?

    Anyone who has worked with third party APIs knows the headaches of "The weather API is returning 500 errors" -- "Yeah maybe we should cache the last known API result and use that? And send a Slack message to some channel when the API is down?".

    If you suddenly introduce those same forms of overhead internally within the company all over the place, it's not all smooth sailing.
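The "cache the last known result and ping a channel" idea from the weather API example can be sketched roughly like this in Rust. This is an illustration under assumptions, not anyone's production code: `LastKnownGood`, `Fetched`, `max_age` and the `alert` callback are made-up names, and the actual HTTP call and Slack message are left to the caller as closures.

```rust
use std::time::{Duration, Instant};

/// Remembers the last successful result of a flaky call.
struct LastKnownGood<T> {
    last: Option<(T, Instant)>,
    max_age: Duration,
}

/// Tells the caller whether the value is live or served from the cache.
#[derive(Debug, PartialEq)]
enum Fetched<T> {
    Fresh(T),
    Stale(T),
}

impl<T: Clone> LastKnownGood<T> {
    fn new(max_age: Duration) -> Self {
        Self { last: None, max_age }
    }

    /// Runs `fetch`. On failure, calls `alert` (e.g. post to a chat channel)
    /// and falls back to the cached value if it is not older than `max_age`.
    fn get<E>(
        &mut self,
        fetch: impl FnOnce() -> Result<T, E>,
        alert: impl FnOnce(&E),
    ) -> Result<Fetched<T>, E> {
        match fetch() {
            Ok(value) => {
                self.last = Some((value.clone(), Instant::now()));
                Ok(Fetched::Fresh(value))
            }
            Err(err) => {
                alert(&err);
                match &self.last {
                    Some((value, at)) if at.elapsed() <= self.max_age => {
                        Ok(Fetched::Stale(value.clone()))
                    }
                    // Nothing cached, or the cached value is too old: surface the error.
                    _ => Err(err),
                }
            }
        }
    }
}
```

Returning `Fresh` versus `Stale` rather than a bare value is a deliberate choice in this sketch: the caller has to decide what to do with outdated data instead of silently treating it as current.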
  • 1
    Can't type hints help with that? Sounds like you can toss a lot of cases that way.
  • 1
    @bittersweet Ay yup.

    Even better when the third-party API plays pseudo-random clusterfuck.

    Aka inconsistent return values / HTTP status codes...

    Your definition of micro is right in my opinion.

    Micro is a misleading term, and some claim that its definition must include "replaceable in XY months"...

    Estimates for working hours are guesswork at best... Using such a subjective value for a definition is just bullshit.

    What's more important is that the service does, or comes close to doing, what you said: one thing, handling one domain and only that domain.

    Rest is just following best practices.

    Decoupling, stateless if possible, finding proper and sane ways to deal with the mass of data, always have scaling in mind, proper versioning and dependency handling.
  • 0
    @hjk101 Yeah partially.

    Good linters/inspection/static analysis help as well.

    But still, Laravel for example kind of forces you to use plenty of dirty PHP magic.

    And while PHP type hinting is getting better and better, many libraries will ask for or return "mixed" as a type, and null handling isn't perfect either.
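For comparison, a rough sketch of how the same "mixed or null" situation tends to look in Rust. The `Mixed` enum is a hypothetical stand-in for a loosely typed value coming out of a library, and `parse_quantity` and `ParseError` are illustration names too. The value gets converted exactly once at the boundary, and everything past that point only ever sees a `u32`.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a loosely typed value, like PHP's `mixed`.
#[derive(Debug, Clone, PartialEq)]
enum Mixed {
    Null,
    Int(i64),
    Text(String),
}

/// Every way the conversion can fail is spelled out in the type.
#[derive(Debug, PartialEq)]
enum ParseError {
    Missing,
    NotANumber(String),
    OutOfRange(i64),
}

/// Converts a loosely typed value into a quantity, or says why it can't.
fn parse_quantity(raw: &Mixed) -> Result<u32, ParseError> {
    let n = match raw {
        Mixed::Null => return Err(ParseError::Missing),
        Mixed::Int(n) => *n,
        Mixed::Text(s) => s
            .trim()
            .parse::<i64>()
            .map_err(|_| ParseError::NotANumber(s.clone()))?,
    };
    // Negative or oversized numbers are rejected rather than wrapped around.
    u32::try_from(n).map_err(|_| ParseError::OutOfRange(n))
}
```

The null case cannot be forgotten here, because the `match` does not compile without it.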
  • 1

    I think for us the metric would be "replaceable in x developer suicides"
  • 1
    Crikey! What does it actually do? Run a small country?
  • 1
    @bittersweet We have not really gone all the way to microservices yet, but we're on track.

    And I think it's more a design philosophy than a rulebook.

    And "micro" all depends.

    I think the first criterion is micro in purpose, as in single-purpose services.

    That purpose could require a lot of code, but it should expose a reasonably simple API.

    And it should always feel right for you.
  • 1
    @ultrageoff Approximately.

    Don't want to be "too easily findable" here, but yeah, it's basically a large consultancy contractor, mainly for governmental bodies.

    We have enormous piles of data on infrastructure, housing, labor, traffic, education, income, etc. When you know everything about everyone... you can churn out creepily detailed reporting and analytics to advise public transport, universities, road construction, housing projects, environmental institutes, etc.

    I work mostly on "the PHP monster", which provides inter- & intranet pages, OAuth dashboard panels for various contractors and layers of gov, finance backends, etc. There's also a large Python/TensorFlow department which churns through enormous datasets.

    So yeah, technically, this PHP monster is just "the human interface" to an equally large Python AI brain monster...

    Which does in fact kind of run a small country.
  • 1
    Do you work at Facebook?