• 0 Posts
  • 64 Comments
Joined 1 year ago
Cake day: June 17th, 2023

  • Big if true. The current academic evaluation system, built on poor KPIs, looks more like a “call center” evaluation system than anything useful. It is the main reason most of the best young scholars I know have left academia for industry. Everyone I knew in academia who was good enough to find something elsewhere left; those who couldn’t stayed, and they are burned out, stuck in the rat race.

    It is sad, because we were all ready to accept less money and being overworked for the chance to do something useful. And society has lost so much value to a broken, corrupt system that squashes human dignity to make money for small mafias, institutions and publishers.








  • VSCode also supports it for other shells. This repo is not about VSCode, it’s about actual shells. You are the one who is incorrect in this case.

    When someone brings points to the discussion, you react like a fanboy student who just bought a new gaming laptop.

    Could you please reply to the points in the discussion or go back to school? I am too old for your “no, you shit, you stoopid”.

    I wonder why I keep answering your comments, and that’s why this is my last one.












  • A Markov chain models a process as a sequence of transitions between states, where the transition probabilities depend only on the current state.

    An LLM is ideally less like a Markov chain and more like a discrete Langevin dynamics: both have a memory (the attention mechanism for LLMs, inertia for LD), and both have noise controlled by a parameter (temperature in both cases; the name “temperature” in the LLM context is derived directly from thermodynamics).

    As far as I remember, the original attention paper doesn’t reference Markov processes.

    I am not saying one cannot explain it starting from a Markov chain. It is just that the claim that we could have done it decades ago, but lacked the horsepower and the data, is wrong. We didn’t have the method to simulate writing. We now have a decent one, and the horsepower to train it on a lot of data.
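
To make the two ideas in the comment above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any real LLM is implemented: the two-state chain, its probabilities, and the function names `markov_step` and `softmax_with_temperature` are all made up for this example. The first function picks the next state from the current state alone (the Markov property). The second shows how a temperature parameter rescales raw scores before they become probabilities, in the same way a Boltzmann distribution depends on temperature.

```python
import math
import random

# Hypothetical two-state chain; the probabilities are invented for illustration.
TRANSITIONS = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}


def markov_step(state, rng):
    """Sample the next state using only the current state (Markov property)."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    state = "sunny"
    path = [state]
    for _ in range(5):
        state = markov_step(state, rng)
        path.append(state)
    print(path)

    logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At low temperature the highest-scoring option dominates, and at high temperature the distribution flattens toward uniform. Note that `markov_step` never looks at `path`, whereas an LLM conditions each token on the whole preceding context through attention.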