• 0 Posts
  • 14 Comments
Joined 7 months ago
Cake day: January 27th, 2025

  • It’s actually not that hard. Most of these bots are using a predictable scheme: headless browsers with no or minimal JS rendering to scrape the web page. Fully deployed browser instances are demonstrably harder to scale and basically impossible to detect without behavioral pattern detection or sophisticated captchas that also cause friction for users.
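
    For the no-JS case, the cheap check is basically “did this client ever execute a single line of JS?”. A minimal sketch of that idea is below (the cookie name, secret, and use of Flask are my own placeholders, not any particular product’s implementation); note it also catches humans browsing with JS disabled, which is exactly the friction problem described next:

        # Sketch: flag clients that never execute a tiny JS snippet.
        # Cookie name, secret, and route are made up for illustration.
        import hmac, hashlib
        from flask import Flask, request, make_response

        app = Flask(__name__)
        SECRET = b"change-me"  # placeholder secret

        def token_for(ip: str) -> str:
            # Tie the token to the client IP so it can't be trivially shared.
            return hmac.new(SECRET, ip.encode(), hashlib.sha256).hexdigest()

        @app.route("/content")
        def content():
            expected = token_for(request.remote_addr or "")
            if request.cookies.get("js_ok") == expected:
                return "the real page"
            # No valid cookie yet: serve a page whose only job is to run JS once.
            # A no-JS headless scraper never gets past this; neither does a
            # human who has JS turned off.
            page = (
                f'<script>document.cookie="js_ok={expected}; path=/";'
                'location.reload();</script>'
                '<noscript>Enable JavaScript to continue.</noscript>'
            )
            return make_response(page, 403)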

    The problem with bots has never rested solely on detectability. It’s about:

    A. How much you inconvenience legitimate users in order to detect them

    B. How much you impact good or acceptable bots like archival crawlers, curl, custom search tools, and loads of other totally benign use cases.


  • It certainly is not negligible compared to static site delivery, which can be breezily cached, unlike on-the-fly tarpits. Even traditional static sites are getting their asses kicked sometimes by these bots. And you want to make that worse by having the server generate text with markov chains for each request? The point for most is reducing the sheer bandwidth and CPU cycles being eaten up by these bots hitting every endpoint.

    Many of these bots are designed to stop hitting endpoints when those endpoints return codes signaling they’ve flattened them.

    Tarpits only make sense from the perspective of someone trying to cause monetary harm to an otherwise uncaring, VC-funded mob with nigh endless amounts of cash to burn. Chances are your middling attempt at causing them friction isn’t, on its own, actually going to get them to leave you alone.

    Meanwhile you burn significant amounts of resources and traffic is still stalled for normal users. This is not the kind of method a server operator who actually wants a dependable service deploys to try to get back up and running again. You want the bots to hit nothing even slightly expensive (read: preferably something minimal you can cache or mostly cache) and to never come back.

    A compromise between these two things is what Anubis is doing. It inflicts maximum pain (on those attempting to bypass it - otherwise it just fails) for minimal cost by creating a small seed (more trivial than even a markov chain – it’s literally just a SHA-256) that a client then has to solve a challenge based on. It’s nice, but certainly not my preference: I like go-away because it leverages browser APIs these headless agents don’t use (and subsequently lets JS-less browsers work) for this class of problem. Then, if you have a record of known misbehavers (their IP ranges, etc.), or some other scheme to keep track of failed challenges, you hit them with fake server-down errors.
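
    For what it’s worth, the shape of that challenge is roughly the sketch below (the difficulty and encoding are illustrative, not Anubis’s actual wire format): issuing and verifying cost the server one hash each, while solving costs the client thousands on average.

        # Illustrative SHA-256 proof-of-work round, in the spirit of Anubis.
        # Difficulty and encoding are made up; only the cost asymmetry matters.
        import hashlib, itertools, os

        DIFFICULTY = 4  # require this many leading zero hex digits

        def issue_challenge() -> str:
            return os.urandom(16).hex()          # cheap for the server

        def solve(seed: str) -> int:
            for nonce in itertools.count():      # expensive for the client
                digest = hashlib.sha256(f"{seed}{nonce}".encode()).hexdigest()
                if digest.startswith("0" * DIFFICULTY):
                    return nonce

        def verify(seed: str, nonce: int) -> bool:
            digest = hashlib.sha256(f"{seed}{nonce}".encode()).hexdigest()
            return digest.startswith("0" * DIFFICULTY)  # cheap again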

    Markov chains and slow loading sites are costing you material just to cost them more material.
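
    To make that cost concrete, a per-request Markov tarpit is roughly the sketch below (the corpus file and word count are placeholders); every single hit spends your CPU and bandwidth generating junk, while a cached static response would cost you almost nothing.

        # Sketch of a per-request Markov tarpit page generator.
        # Every request burns the defender's CPU and bandwidth emitting junk.
        import random
        from collections import defaultdict

        corpus = open("corpus.txt").read().split()  # placeholder corpus

        # First-order chain: word -> list of observed next words.
        chain = defaultdict(list)
        for a, b in zip(corpus, corpus[1:]):
            chain[a].append(b)

        def junk_page(words: int = 2000) -> str:
            w = random.choice(corpus)
            out = [w]
            for _ in range(words):
                w = random.choice(chain[w]) if chain[w] else random.choice(corpus)
                out.append(w)
            return " ".join(out)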




  • If I had a nickel for every time someone ignored me just to say something I directly address…

    You are pretty blatantly referencing X11 Forwarding / Network Transparency.

    I can’t reasonably assume you actually read anything I say, but to briefly reiterate:

    Check out Waypipe. Here’s a direct quote from the README:

    Waypipe is a proxy for Wayland clients. It forwards Wayland messages and serializes changes to shared memory buffers over a single socket. This makes application forwarding similar to ssh -X feasible.
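
    Typical usage, if it helps, is along the lines of the following (the host and the Wayland app name are placeholders):

        waypipe ssh user@example-host some-wayland-app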

    Have you tried this? What is unsatisfactory about it? And if all else fails, is there really ANY problem with simply using VNC/etc? What real-world problem do you have that is uniquely solved with this?


  • What? I’ve gotten RDP, VNC, and SPICE working fine on Wayland. And if you need app-level displays then waypipe worked fine the last time I used it. I’ve been running Proxmox containers with Wayland just fine, too.

    Any particular use case that benefits from what Xorg was uniquely capable of networking-wise (network transparency, afaik?) is quite niche, and development effort towards that end has always reflected that!

    I’ve not been able to find the git or project repo/writeup of “Wayland on Wires”. Though I do vaguely feel like I saw it somewhere.

    But I suppose me and my ongoing computer science degree and shared family hobby of IT simply haven’t reached Real Linux User levels yet. I must sharpen my Bash Blade for another 1000 years…

    Since that’s the case, I suppose I must defer to your Infinitely Endless Wisdom as a True Linux User. I beg of thee, answer my Most Piteous Questions…:

    1. What do you use Xorg’s networking functionality for?
    2. What is ““real”” Linux work?
    3. Why can’t you use Wayland for that?
    4. Have you heard of Waypipe? Have you used it?

  • “Linux as a desktop is BAD!”

    “Evidence?”

    “I failed to make a slideshow in a buggy application. :'(”


    In all seriousness, though, wtf? You could have pulled from any of the well-known papercuts and instead you balk at a broken application? Lmao?

    My vibeo gaem crashed on Windows once. I guess I should hold Microsoft personally accountable for it…


    For the record, I’ve used Linux throughout high school, community college, and college. No issues with basic software functionality, really.

    The worst and only issue I’ve had in that regard is self-inflicted because I decided to run LibreOffice via Wayland, which has an ongoing bug that makes scrolling laggy. That’s it.

    The larger issues with Linux as a desktop are software compat (Wine) with Windows for more niche use cases (requires debugging and a bunch of setup), certain drivers (cough cough Nvidia cough cough), and general dumbass-proofing.





  • That’s a lot better than it could be. But I’m also talking about training costs. Models have to be updated to work swimmingly with new languages, conventions, libraries, etc. Models are not future-proof.

    There are more efficient training methods being employed. See: the stuff R1 used. And existing models can be retooled. But it’s still an intrinsic problem.

    Perhaps most importantly, it’s out of the reach of common consumer-grade hardware to train a half-decent LLM from scratch. It’s a tech that exists mostly in the scope of concentrated power, among people who care little for its environmental ramifications. Relying on this in the short term puts influence and power in the hands of people willing to burn our planet. Quite the hard sell, as you might imagine.

    Also see: the other points I made



  • Energy and water costs for development and usage alone are completely incompatible with that. Come back in 20 years when it’s not batshit insane ecologically.

    Not to mention reducing power usage of programs isn’t going to be very feasible based simply on an LLM’s output. LLMs are biased towards common coding patterns, and those are demonstrably inefficient (if the scourge of Electron-based web apps is any tell). Thus your code wouldn’t work well with lower-grade hardware. Hard sell.

    Theoretically they could be an efficient method of helping build software in the future. As it is now, that’s a pipe dream.

    More importantly, why is the crux of your focus on not understanding the code you’re making? It’s intrinsically contrived from the perspective of a solarpunk future where applications are designed to help people efficiently - without much power, heat, etc… weird, man