Fun With Local ML Models

Discussion in 'Lounge' started by freshage, Jun 4, 2026 at 1:08 PM.

  1. For those nerds who like the machine learning space and actually tinker/build/research in said space, I've been having a giggle with this model - https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

    Normally I run the Qwen 3.6 27B dense model for my day-to-day, which fits nicely on my main GPU, with a smaller MoE model loaded on a second, smaller GPU for agentic work. The dense model only has 1 parallel path; the MoE has 2, for multiple users in the household.

    Anyone else pissing about with this stuff? I've also got a private ~3M parameter financial model I've been working on over the past year (in design for ~3 years), and that project is wrapping up this year.

    I also previously worked in the ML space in healthcare, focused on machine learning for CT scan imaging (you go for a cancer scan, the model reviews the sliced CT images and identifies cancerous nodules; look up Deephealth/Aidence, I no longer work for them).

    Also, if you're looking at pushing the limits of what is possible with your hardware, there is a large discussion on the turbo KV caching released by Google a while back - https://github.com/ggml-org/llama.cpp/discussions/20969 - with a number of posts covering various caching tests on different GPUs and models.
     
    #1 freshage, Jun 4, 2026 at 1:08 PM
    Last edited: Jun 4, 2026 at 4:02 PM
  2. The only part of your post I understood was “Anyone else pissing around with this stuff”. You speak in tongues :D Andy
     
    #2 Android853sp, Jun 4, 2026 at 1:14 PM
    Last edited: Jun 4, 2026 at 3:15 PM
    • Funny x 4
  3. ⎅⍜ ⊬⍜⎍ ⋏⍜⏁ ⌇⌿⟒⏃☍ ⏁⊑⟒ ⌰⏃⋏☌⎍⏃☌⟒ ⍜⎎ ☊⏁⊑⎍⌰⊑⎍?
     
    • Agree x 1
  4. Have you tried asking it about Tiananmen Square yet?
     