It’s difficult to fix, and not without changes to the code. Most solutions involve fixing those heavy SQL queries: tuning them, caching their results in Redis or Memcached, or refactoring the whole process from scratch.
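To make the caching option concrete, here is a rough sketch of the idea. It uses a plain in-process dict with a TTL as a stand-in for Redis or Memcached, so it runs without any server. `TTLCache`, `get_or_compute` and `heavy_report_query` are names I made up for illustration; in a real setup the store would be the external cache and the compute function would be your actual query.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for Redis/Memcached: key -> (expiry time, value)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the heavy query entirely
        value = compute()  # cache miss or expired entry: run the heavy query
        self._store[key] = (now + self.ttl, value)
        return value


def heavy_report_query() -> list:
    """Hypothetical placeholder for the expensive SQL call."""
    return [("widgets", 42)]


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    first = cache.get_or_compute("report:anonymous", heavy_report_query)
    second = cache.get_or_compute("report:anonymous", heavy_report_query)
    print(first == second)
```

The cache key should include every input that changes the result. For anonymous traffic that is usually just the query name and its parameters.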
Thinking about the DDoS part, implement short circuits so that reaching those queries requires following a session pattern. It doesn’t stop the attack, but it forces those script kiddies to make real connections. If they are anonymous, all the heavy queries should be served from cache, since there are no per-user variables. If not, it’s a matter of identifying users and banning them automatically.
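A minimal sketch of that decision flow, assuming a simple sliding-window counter for the automatic ban. Everything here (`HeavyEndpointGuard`, the thresholds, the returned labels) is hypothetical and kept in memory only; a real deployment would keep this state in a shared store and tune the limits to its own traffic.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Optional, Set


class HeavyEndpointGuard:
    """Decides how a request for a heavy query should be handled."""

    def __init__(self, max_hits: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_hits = max_hits
        self.window = window_seconds
        self.clock = clock
        self.sessions: Set[str] = set()   # sessions opened through the normal flow
        self.banned: Set[str] = set()     # users banned automatically
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def start_session(self, session_id: str) -> None:
        """Called by the regular entry point (e.g. the landing page)."""
        self.sessions.add(session_id)

    def decide(self, session_id: str, user_id: Optional[str] = None) -> str:
        # Short circuit: no session pattern, no access to the heavy query.
        if session_id not in self.sessions:
            return "reject"
        # Anonymous users have no custom variables, so the cached result is enough.
        if user_id is None:
            return "cached"
        if user_id in self.banned:
            return "banned"
        # Identified user: count hits in the sliding window and ban on abuse.
        now = self.clock()
        recent = self.hits[user_id]
        while recent and recent[0] <= now - self.window:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_hits:
            self.banned.add(user_id)
            return "banned"
        return "live"


if __name__ == "__main__":
    guard = HeavyEndpointGuard(max_hits=3, window_seconds=10)
    print(guard.decide("unknown-session"))
    guard.start_session("s1")
    print(guard.decide("s1"))
    print(guard.decide("s1", user_id="u1"))
```

The ordering is the point: the cheap checks (session, anonymous) run first, so most abusive traffic never gets near the database.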
Prometheus + node/container exporters, + Grafana for dashboards. I haven’t touched Zabbix in years, but last time I used it, it didn’t handle dynamic scaling very well. Also, all of them are focused on monitoring infrastructure; you need to pay if you want APM or UX monitoring.
At enterprise level, for APM I like Datadog, much better than New Relic. For UX we use Acoustic Tealeaf.