gardnr's comments

This is a 30B parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1]

You can expect this model to have similar performance to the non-omni version. [2]

There aren't many open-weights omni models so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B

2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct


This is a stack of models:

- 650M Audio Encoder

- 540M Vision Encoder

- 30B-A3B LLM

- 3B-A0.3B Audio LLM

- 80M Transformer / 200M ConvNet audio-token-to-waveform decoder
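
Back-of-the-envelope from those numbers (assuming the encoders and the token-to-waveform decoder are dense and always active, which is my guess rather than something stated in the release):

    # Rough parameter tally from the breakdown above, values in billions.
    # Assumption: the encoders and the audio decoder run on every step.
    total_b  = 0.65 + 0.54 + 30 + 3 + 0.08 + 0.20   # ~34.5B parameters on disk
    active_b = 0.65 + 0.54 +  3 + 0.3 + 0.28        # ~4.8B parameters active per step
    print(round(total_b, 2), round(active_b, 2))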

This is a closed-weight update to their Qwen3-Omni model. They had a previous open-weight release, Qwen/Qwen3-Omni-30B-A3B-Instruct, and a closed version, Qwen3-Omni-Flash.

You basically can't use this model right now since none of the open source inference frameworks have the model fully implemented. It works in transformers, but it's extremely slow.
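
For anyone curious, here is roughly what "works in transformers" looks like. This is only a sketch modeled on the model card of the earlier open-weight Qwen/Qwen3-Omni-30B-A3B-Instruct release; the class names, the speaker argument, and the (text, audio) return from generate() are taken from that card, and I'm assuming any newer checkpoint would expose the same interface if it lands on the Hub:

    # Sketch only: mirrors the usage shown for Qwen/Qwen3-Omni-30B-A3B-Instruct.
    # Exact class names / return types may differ for a newer checkpoint.
    import soundfile as sf
    from transformers import Qwen3OmniMoeForConditionalGeneration, Qwen3OmniMoeProcessor

    model_id = "Qwen/Qwen3-Omni-30B-A3B-Instruct"  # swap in the new repo if/when it appears
    processor = Qwen3OmniMoeProcessor.from_pretrained(model_id)
    model = Qwen3OmniMoeForConditionalGeneration.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    conversation = [
        {"role": "user", "content": [{"type": "text", "text": "Introduce yourself in one sentence."}]}
    ]
    text = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
    inputs = processor(text=text, return_tensors="pt", padding=True).to(model.device)

    # generate() returns text tokens plus (optionally) a waveform from the talker head;
    # the prompt tokens are still included in text_ids here.
    text_ids, audio = model.generate(**inputs, speaker="Ethan", max_new_tokens=256)
    print(processor.batch_decode(text_ids, skip_special_tokens=True)[0])
    if audio is not None:
        sf.write("reply.wav", audio.reshape(-1).detach().cpu().numpy(), samplerate=24000)

That's the slow path: until the audio/vision stack is implemented in the faster serving frameworks, generation crawls along in plain transformers.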



No... that website is not helpful. If you take it at face value, it is claiming that the previous Qwen3-Omni-Flash wasn't open either, but that seems wrong? It is very common for these blog posts to get published before the model weights are uploaded.

The previous -Flash weights are closed. They do have weights for the original model, which is slightly behind in performance: https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct

Based on things I had read over the past several months, Qwen3-Flash seemed to just be a weird marketing term for the Qwen3-Omni-30B-A3B series, not a different model. If they are not the same, then that is interesting/confusing.

It is an in-house closed weight model for their own chat platform, mentioned in Section 5 of the original paper: https://arxiv.org/pdf/2509.17765

I've seen it in their online materials too but can't seem to find it now.


I can't find the weights for this new version anywhere. I checked modelscope and huggingface. It looks like they may have extended the context window to 200K+ tokens but I can't find the actual weights.

They link to https://huggingface.co/collections/Qwen/qwen3-omni-68d100a86... from the blog post, but this seems to redirect to their main space on HF, so maybe they haven't made the model public yet?

> There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

Last I checked (months ago), Claude used to do this.


Haha, you could hear how its mind thinks, maybe by putting a lot of reverb on the thinking tokens or some other effect…

I don't think the Flash model discussed in the article is 30B.

Their benchmark table shows it beating Qwen3-235B-A22B

Does "Flash" in the name of a Qwen model indicate a model-as-a-service and not open weights?


Flash is a closed-weight version of https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct (it is 30B, but with additional training on top of the open-weight release). They deploy the Flash version on Qwen's own chat.

Thanks

Was it being closed-weight obvious to you from the article? I'm trying to understand why I was confused. I hadn't seen the "Flash" designation before.

Also 30B models can beat a semi-recent 235B with just some additional training?


They had a Flash variant released alongside the original open weight release. It is also mentioned in Section 5 of the paper: https://arxiv.org/pdf/2509.17765

For the evals, it's probably just trained on a lot of benchmark-adjacent datasets compared to the 235B model. A similar thing happened with another model today: https://x.com/NousResearch/status/1998536543565127968 (a 30B model trained specifically to do well in maths gets near-SOTA scores).


> This is a 30B parameter MoE with 3B active parameters

Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.


The link[1] at the top of their article to HuggingFace goes to some models named Qwen3-Omni-30B-A3B that were last updated in September. None of them have "Flash" in the name.

The benchmark table shows this Flash model beating their Qwen3-235B-A22B. I don't see how that is possible if it is a 30B-A3B model.

I don't see a mention of a parameter count anywhere in the article. Do you? This may not be an open weights model.

This article feels a bit deceptive

1: https://huggingface.co/collections/Qwen/qwen3-omni


I love lightweight distros. QNX had a "free as in beer" distro that fit on a floppy, with Xwindows and modem drivers. After years of wrangling with Slackware CDs, it was pretty wild to boot into a fully functional system from a floppy.

> QNX had a "free as in beer" distro that fit on a floppy, with Xwindows and modem drivers.

I don’t think that had the X Window System. https://web.archive.org/web/19991128112050/http://www.qnx.co... and https://marc.info/?l=freebsd-chat&m=103030933111004 confirm that. It ran the Photon microGUI Windowing System (https://www.qnx.com/developers/docs/6.5.0SP1.update/com.qnx....)



I never understood how that QNX desktop didn't catch on instantly; it was amazing!

Licensing, and QNX missed a consumer launch window by around 17 years.

Some businesses stick with markets they know, as non-retail customer revenue is less volatile. If you enter the consumer markets, there are always 30k irrational competitors (likely with 1000X the capital) that will go bankrupt trying to undercut the market.

It is a decision all CEOs must make eventually. Best of luck =3

"The Rules for Rulers: How All Leaders Stay in Power"

https://www.youtube.com/watch?v=rStL7niR7gs


This also underscores my explanation for the “worse is better” phenomenon: worse is free.

Stuff that is better designed and implemented usually costs money and comes with more restrictive licenses. It’s written by serious professionals later in their careers working full time on the project, and these are people who need to earn a living. Their employers also have to win them in a competitive market for talent. So the result is not and cannot be free (as in beer).

But free stuff spreads faster. It’s low friction. People adopt it because of license concerns, cost, avoiding lock-in, etc., and so it wins long term.

Yes I’m kinda dissing the whole free Unix thing here. Unix is actually a minimal lowest common denominator OS with a lot of serious warts that we barely even see anymore because it’s so ubiquitous. We’ve stopped even imagining anything else. There were whole directions in systems research that were abandoned, though aspects live on usually in languages and runtimes like Java, Go, WASM, and the CLR.

Also note that the inverse is not true. I’m not saying that paid is always better. What I’m saying is that worse is free, better was usually paid, but some crap was also paid. But very little better stuff was free.


There is also the option, for well-written professional software, where the strategy is to grab as much market share as they can by allowing the proliferation of their product to lock up market/mindshare, and relegate the $ enforcement to later - successfully used by MS Windows for the longest time, and by Photoshop.

Conversely, I remember Maya (or Autodesk) used to have a bounty program for whoever would turn in people using unlicensed/cracked versions of their product. Meanwhile, Blender (coming from a commercial past) kept its free nature and has consistently grown in popularity and quality without any such overtures.

Of course, nowadays with SaaS everything gets segmented into weird verticals and revenue upsells are across the board, with the first hit usually also being free.


As a business, dealing with Microsoft and Oracle is not a clean transactional sale.

They turned into legal-services firms along the way, and stopped real software development/risk at some point in 2004.

These firms have been selling the same product for decades. Yet once they get their hooks into a business, few survive the incurred variable costs of the 3000lb mosquito. =3


The only reason FOSS sometimes works is that the replication cost is almost $0.

In *nix, most users had a rational self-interest to improve the platform. "All software is terrible, but some of it is useful." =3


Speaking as someone who used QNX back then: they didn't target end users, but embedded and real-time users.

They were expensive too. You had to pay for each device driver you used.


Because it's not free, and their aim was at developers and the embedded space. How many people have even heard of QNX?

That famous QNX boot disk was the first thing I thought of when reading the title as well.

Me too! And the GUI was only a 40KB distribution and was waaaaaay better than Windows 3.0!

And incredibly responsive compared to the operating systems of even today. Imagine that: 30 years of progress to end up behind where we were. Human input should always run at the highest priority in the system, not the lowest.

That ended with Win9x. It was the last OS where the mouse and keyboard inputs were processed as hardware interrupts.

Yeah, but what can you do with free QNX? With Tiny Core, you can install many packages. What packages exist for QNX?

> Yeah, but what can you do with free QNX?

« QNX DEMO disk

Extending possibilities and adding undocumented features »

http://qnx.puslapiai.lt/qnxdemo/qnx_demo_disk.htm


Location: NZ / USA

Remote: Yes

Willing to relocate: Yes

Technologies: Mainly web SaaS (TypeScript/Python) with a focus on AI, embeddings, LLMs and Voice Agents.

Résumé/CV: https://docs.google.com/document/d/1mcK18EVVkjQ-KadXu0VW6Ovx...

Email: gardner@bickford.ai


Full length: https://www.youtube.com/watch?v=WXuK6gekU1Y

They do a great job capturing the "Move 37" moment: https://youtu.be/WXuK6gekU1Y?t=2993


Telegram, which ostensibly claimed to provide e2e but really only did in very specific circumstances? My right wing uncle is still bitter about that. Then there's the rolling over the founder did after getting pulled up by Interpol.

I'm not sure why anyone would trust Telegram.


Exactly what "rolling over" would that be?

Maybe you don't believe Durov's statement[0] about it. But is there any actual evidence anywhere that they've ever violated the secrecy of non-e2e private groups or messages for anyone? I've yet to find any.

[0] https://t.me/durov/342


Oh, I agree. It's just that they mentioned things like Twitter DMs and Messenger and forgot Telegram...

That wasn’t the original question though. Twitter and Messenger are also untrustworthy. Telegram’s message export is very good compared to all the other options.

He says the current models generalize dramatically worse than people: https://youtu.be/aR20FWCCjAs?t=1501

Then, he starts to talk about the other ideas but his lawyers / investors prevent him from going into detail: https://youtu.be/aR20FWCCjAs?t=1939

The worrisome thing is that he openly talks about whether to release AGI to the public. So, there could be a world in which some superpower has access to wildly different tech than the public.

To take Hinton's analogy of AGI to extraterrestrial intelligence, this would be akin to a government having made contact but withholding the discovery and the technology from the public: https://youtu.be/e1Hf-o1SzL4?t=30

It's a wild time to be alive.


It’s also weird to think that if there is extraterrestrial contact, it will most definitely happen in the specific land mass known as the United States and only the US government will be collecting said technology and hiding it. Out of the entire planet, contact is possible only in the USA.


I'm not sure if you're jabbing at the concept of American supremacy, or Hinton's idea, or my position. I don't live in the USA right now, but I am happy to participate in conversation. That's why I am here.

Can you unpack your ideas a bit more?


I thought about the huge pile of hard drives in Utah this morning. The TLAs in the USA have a metric shit ton of data that _should_ not be used but _could_ be used.

Even still, we need evolutions in model architecture to get to the next level. Data is not enough.


They are going to be used.


A lot of what's on that pile of hard drives is ciphertext waiting for cryptographically relevant quantum computing to arrive.

LLMs can't do jack shit with ciphertext (sans key).


The gains were on benchmarks. Ilya describes why this is a red herring here: https://youtu.be/aR20FWCCjAs?t=286


Gemini 3 is a huge jump. I can't imagine how anyone who uses the models all the time wouldn't feel this.


What does it do that Opus doesn't do?

I like Ilya's points, but it's also clearly progress, and we can't just write it off because we like another narrative.


This is an AI generated article based on real interviews.

Watch the original Sutskever interview: https://www.youtube.com/watch?v=aR20FWCCjAs

And LeCun: https://www.youtube.com/watch?v=4__gg83s_Do


Not to say you are wrong, but how do you know it's AI-generated rather than written by Sorca Marian? To me, phrases like "Models look god-tier on paper" look more human than AI, as a) "god-tier" never came up in the interview and b) it's brief and doesn't waffle on.


If it was AI-generated, I had no difficulty with it; it was certainly on par with typical surface-level journalistic summaries, and vastly better than losing 2 hours of my life to watching some video interviews... :) AI as we know it may not be real intelligence, but it certainly has valid uses.


I get a lot from seeing the person talk vs reading a summary. I have gone back and watched a lot of interviews and talks with Ilya. In hindsight, it is easy to hear the future ideas in his words at the time.

That said, I use AI summaries for a lot of stuff that I don't really care about. For me, this topic is important enough to spend two hours of my life on, soaking up every detail.

As for being on par with typical surface-level journalism: I think we might be further into the dead internet than most people realize: https://en.wikipedia.org/wiki/Dead_Internet_theory


> I get a lot from seeing the person talk vs reading a summary

And some people simply have the opposite preference. There's lots of situations where sound is simply not an option. Some people are hearing impaired. Some people are visually impaired, and will definitely not get much from watching the person speak. Some ESL people have a hard time with spoken English. Even some native English speakers have a hard time with certain accents. Some people only have 5 minutes to spare instead of 50.

All of those problems are solved by the written word, which is why it hasn't gone away yet, even though we have amazing video capabilities.

You can have a preference without randomly labeling everything you don't like as AI slop.


How does it compare and contrast to DNSCrypt?

https://github.com/DNSCrypt

