• 1 Post
  • 9 Comments
Joined 6 months ago
Cake day: January 3rd, 2024

  • It was a mixture of factors.

    Data was to be dumped into an S3 bucket (MinIO). This created an event, and another team had built an orchestrator which would do a couple of things but eventually supply an endpoint with a reference to a plain-text file for analysis.

    The Java devs had to [modify the example from the Camel docs](https://camel.apache.org/manual/rest-dsl.html) and use the built-in Jackson library to convert the incoming object to a class. This used the default AWS S3 API to create a stream handle, which fed into the example from the OpenNLP docs.

    The Python project first hit a wall setting up Flask. They followed the instructions and it didn’t work from setuptools. The Java team had just created a new Maven project from IntelliJ, but the same approach didn’t work for the Python team using PyCharm. It lost them a couple of days until I helped them overcome it.

    Then they hit a wall with Boto3. The team expected to stream data but found Boto3 only supported downloading. There was also a complexity issue: the AWS SDK in Java was about 20 lines to set up and a single line to call, while it was about 50 lines in Python. On the positive side, I got to explain what all the config meant in S3.

    This caused the team another few days of delay. They knew I used a 350MiB Samsung TV guide to test robustness, so they had to go and learn about Docker volume mounting. They also thought they needed a stateful Kubernetes service, and I had to explain why that was wrong.

    Basically Python threw up a lot of additional complexity and the docs weren’t as helpful as they could have been.



  • You do, but considering the scale at which they process data, I suspect Google would be better off building Go tooling (or whatever the dominant internal language is).

    A few years back I was trying to teach some graduates the importance of looking at a programming language ecosystem and selecting it based on that.

    One of my comparison projects was Apache OpenNLP/Camel vs Flask/spaCy.

    spaCy is the go-to for NLP, so I expected it to be quicker to develop with, easier to use, give better results or just use fewer resources.

    I assigned the grads with Java experience to spaCy and those with Python experience to OpenNLP.

    The OpenNLP guys were done first. They raved about being able to stream data into the model and how much simpler it made life.

    When compared on the same corpus (books, team emails, corporate SharePoint, dev docs, etc.), OpenNLP would complete on 4GiB of RAM in less than a second on 0.5 vCPU. spaCy needed 12GiB and was taking ~2 seconds with 2 vCPU. They produced the same results…

    Me and a few others ended up spending a day reading the Python and trying to optimise it; surely the juniors had done something daft. They had not.

    It rather undermined my point.
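
The streaming the OpenNLP group raved about comes down to handing the model a stream handle instead of a fully downloaded file. A minimal sketch of that difference, using only the Python standard library: `count_tokens_streaming`, `count_tokens_downloaded` and the whitespace tokenising are hypothetical stand-ins for a real NLP model, not what either team wrote.

```python
import io
from collections import Counter
from typing import BinaryIO


def count_tokens_streaming(stream: BinaryIO) -> Counter:
    """Count whitespace-separated tokens, reading one line at a time.

    Hypothetical stand-in for feeding a stream handle to a model:
    only the current line is held in memory, never the whole document.
    """
    counts: Counter = Counter()
    for raw_line in stream:  # binary file-like objects iterate line by line
        counts.update(raw_line.decode("utf-8").lower().split())
    return counts


def count_tokens_downloaded(stream: BinaryIO) -> Counter:
    """Same result, but the whole payload is read into memory first."""
    text = stream.read().decode("utf-8")
    return Counter(text.lower().split())


# io.BytesIO stands in for any byte stream (an open file, an HTTP body, ...).
sample = b"Stream the data\ninto the model\n"
streamed = count_tokens_streaming(io.BytesIO(sample))
downloaded = count_tokens_downloaded(io.BytesIO(sample))
print(streamed["the"], streamed == downloaded)
```

Both functions give the same counts; the difference only shows up in peak memory once the input is something like a 350MiB file.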


  • My expectation is that, whatever the solution, it needs to be dockerised and really easy to deploy via Docker Compose or Kubernetes so people can quickly and easily set up their own.

    The front end is effectively static files, so I would probably choose Apache or Express (whichever gives me a smaller Docker image)…

    For the backend I would choose Java for Spring Boot. An Alpine image with OpenJDK and the app is tiny. Spring has a library for every kind of interface, making them trivial to implement, but the main reason is Hibernate.

    Hibernate (which sits underneath Spring Data JPA) was one of the first libraries that let you switch out databases without having to change code (it’s all config). A lot of Mastodon instances struggle with the resource requirements of Elasticsearch, so letting small instances use something like Postgres would seem ideal.

    I have noticed Go/Rust still expect you to write or manage a lot of stuff Spring gives away for free. Python is OK if your backend is really tiny, but there is a lot of boilerplate in how Python libraries work, so complex projects get hard to manage, and I assume interacting with the fediverse will add complexity.


  • SpaceX are on track to launch 130 times this year. The industry competitors launch 6-12 times per year.

    I suspect the higher incident rate is related, since you will need to manufacture, roll out, etc… much more often.

    The article also talks about most of the incidents being in booster recovery. Only two SpaceX competitors do that.

    Blue Origin’s suborbital booster always returned to the launch site and at best managed a monthly launch. This rocket hasn’t launched in more than a year.

    Rocket Lab perform ocean recovery but only launched 11 times last year and only recovered the booster twice.

    So it’s hard to really compare.




  • Technical Leads are not rational beings and lots of software is developed from an emotional standpoint.

    Engineering is trade-offs: every technical decision you make has pros and cons.

    What you should do is write out the core requirements/constraints. Then you weigh the choices to select the option that best meets them.

    What actually happens is someone really likes X framework, Y programming language or Z methodology and so decides the solution and then looks for reasons to justify it.

    Currently the obvious tell is if they pitch Rust. I am not saying Rust is bad, but you’ll notice they will extol the memory safety or performance and forget about the actual requirements of the project.
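
Writing out the requirements and weighing the choices against them can be as simple as a weighted score per option. A rough sketch of that idea follows; every requirement name, weight, option and score in it is made up for illustration, and a real project would substitute its own.

```python
# Hypothetical requirements and weights (higher weight = matters more).
WEIGHTS = {"team_experience": 5, "time_to_deliver": 4, "runtime_performance": 2}

# Hypothetical 1-5 scores for each option against each requirement.
OPTIONS = {
    "language_the_team_knows": {
        "team_experience": 5, "time_to_deliver": 4, "runtime_performance": 3,
    },
    "language_the_lead_likes": {
        "team_experience": 1, "time_to_deliver": 2, "runtime_performance": 5,
    },
}


def weighted_score(scores: dict, weights: dict) -> int:
    """Sum each requirement's score multiplied by that requirement's weight."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_options(options: dict, weights: dict) -> list:
    """Return (option, score) pairs, best first."""
    scored = {name: weighted_score(s, weights) for name, s in options.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranking = rank_options(OPTIONS, WEIGHTS)
for name, score in ranking:
    print(name, score)
```

The point is the order of operations: the weights are fixed before any option is scored, so the favourite has to earn its place rather than have reasons found for it afterwards.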


  • It isn’t a good move.

    A domain name can cost as little as £10, and most email services cost ~£5-£15 per person per month. It’s normally pretty easy to link a domain to an email provider and doesn’t cost anything other than time.

    If a company can’t be bothered to implement the most basic online branding people will make their assumptions and some will filter your company out because of it. With the cost to implement so low (e.g. £160 per year), even the loss/gain of a single customer would justify it.
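
The arithmetic behind that yearly figure can be sketched as below. It assumes the £10 domain price is per year and a single mailbox; `annual_cost` is a hypothetical helper, not anything from a provider.

```python
def annual_cost(domain_per_year: float,
                email_per_user_per_month: float,
                users: int = 1) -> float:
    """Yearly cost of a domain plus hosted email, in pounds.

    Assumes the domain is billed yearly and email monthly per user.
    """
    return domain_per_year + 12 * email_per_user_per_month * users


low = annual_cost(10, 5)    # cheapest email tier quoted above
high = annual_cost(10, 15)  # most expensive email tier quoted above
print(low, high)
```

For one person this gives a range of £70 to £190 a year, which the £160 example sits inside.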