I recently got into creating avatars for VR and have used AI to learn Unity/Blender ridiculously fast; I've only been at it a couple of weeks. All the major models can answer basically any question. I can paste in screenshots of what I'm working on along with my questions and it will tell me step by step what to do. I'll ask what particular settings mean (there are so many settings in 3D programs) and it'll explain them all and suggest defaults. You can literally give Gemini UV maps and it'll generate textures for you, or use this for 3D models. It feels like the jump before/after Stack Overflow.
The game Myst is all about this magical writing script that allowed people to write entire worlds in books. That's where it feels like this is all going. Unity/Blender/Photoshop/etc. are ripe for putting an LLM over the entire UI and exposing the APIs to it.
ForTheKidz 50 days ago [-]
> The game Myst is all about this magical writing script that allowed people to write entire worlds in books. That's where it feels like this is all going. Unity/Blender/Photoshop/etc. are ripe for putting an LLM over the entire UI and exposing the APIs to it.
This is probably the first pitch for using AI as leverage that's actually connected with me. I don't want to write my own movie (sounds fucking miserable), but I do want to watch yours!
iaw 50 days ago [-]
I have this system 80% done for novels on my machine at home.
It is terrifyingly good at writing. I expected freshman college level, but it's actually close to professional in terms of prose.
The plan is maybe to transition into children's books, then children's shows made with AI, catered to a particular child at a particular phase of development (Bluey talks to your kid about making sure to pick up their toys).
thisisnotauser 50 days ago [-]
I think there's a big question in there about AI that breaks a lot of my preexisting worldviews about how economics works: if anyone can do this at home, who are you going to sell it to?
Maybe today only a few people can do this, but five years from now? Ten? What sucker would pay for any TV shows or books or video games or anything if there's a comfy UI workflow or whatever I can download for free to make my own?
fixprix 50 days ago [-]
It breaks economics in a good way. Spending fewer resources on all kinds of media and other things is deflationary. Prices go down, and a single person can provide for a family while working fewer hours.
yfw 50 days ago [-]
How is this good in any way for the creative workers? Do you think there's a sustainable source of innovative and interesting experiences being generated if nobody writes anymore?
fixprix 49 days ago [-]
It's good for everybody. We have to work less to survive. Everyone is more productive at whatever they do. The value is in the ideas, not the medium. You might write a book and I'll take it and use AI to turn it into audio, or a TV series, or a movie, or a video game.
The bottleneck is no longer the labor to turn ideas into reality; the bottleneck is imagination itself. It's incredible. The cost to produce/consume going down, along with many other facets of the economy, translates into deflation.
If you make less money, or work fewer hours, or only have a single person in your family working, that's OK because money will go further. That's the whole idea behind Star Trek; the first step, though, was intelligent computers, automation and robots, harnessed in a way that doesn't backfire on us.
yfw 49 days ago [-]
We were all supposed to have less work with all the innovation and automation, but look at us now, with all the households where two adults are required to have jobs.
fixprix 48 days ago [-]
We produce/consume a lot of things like media and other goods/services that we don't really need. When productivity is channeled into low value things, we all end up having to work more.
_carbyau_ 49 days ago [-]
The vision is nice. I don't see it playing out practically.
It's the human competition. Every human in the world is competing - on some level - with every other human.
The widespread "benefits" to society you describe just change the playing field all humans are competing on. The money will go from that to something else; to pick something topical, say housing.
fixprix 48 days ago [-]
Competition and the value of a dollar are not related. If a dollar can go far, then that's great. You can compete in business, or sports, or video games, whatever. That's great.
_carbyau_ 46 days ago [-]
> Competition and the value of a dollar are not related.
Precisely. Being able to more easily produce and consume just means more production and consumption. The "rat race" competition will continue. No one will work less.
renerick 49 days ago [-]
Ideas are worthless. There are billions of ideas born and dying every day. It is the execution that gives an idea value; it is the quality of the execution that determines the amount of value.
fixprix 48 days ago [-]
And AI/robotics is what is turbocharging execution.
robotresearcher 49 days ago [-]
You're correct but all that is true on average. There will be specific people - maybe a large number of them - for whom this is extremely painful. We can celebrate the former but let's not sweep the latter under the rug.
numpad0 49 days ago [-]
> the bottleneck is imagination itself. It's incredible.
You don't understand what that means. It means your soul's worth is measured with more precision and accuracy. Conjoined with a free-market economy, that implies that individuals and groups of people who produce less can be more openly considered objectively less human.
I don't personally care, but that's not... It seems to me that the vast majority of people around here already don't exactly like what the Internet has always rewarded, nor how fast it's evolving, nor where it's headed. I think that accelerating this only accelerates all of that, and I suspect it might not be something you would reflect on positively later.
treyd 49 days ago [-]
This would be true if it were doing this with essentials like food, healthcare, housing, and transit, but it's not.
CamperBob2 50 days ago [-]
> What sucker would pay for any TV shows or books or video games or anything if there's a comfy UI workflow or whatever I can download for free to make my own?
I think it's about time the industry faced that risk. They have it coming in spades.
For example, LOST wouldn't have been such a galactic waste of time if I could have asked an AI to rewrite the last half of the series. Current-generation AI is almost sufficient to do a better job than the actual writers, as far as the screenplay itself is concerned, and eventually the technology will be able to render what it writes.
Call it... severance.
numpad0 50 days ago [-]
Only a few people are both capable and willing to take on creative tasks, with AI or not. Boring people cannot form cohesive thoughts strong enough to drive an AI, even if the AI's output itself were not as boring as they are.
YurgenJurgensen 49 days ago [-]
The ‘professional level’ prose to which you refer:
“ABSOLUTE PRIORITY: TOTAL, COMPLETE, AND ABSOLUTE QUANTUM TOTAL ULTIMATE BEYOND INFINITY QUANTUM SUPREME LEGAL AND FINANCIAL NUCLEAR ACCOUNTABILITY”
Even if AI prose weren’t shockingly dull, these models all go completely insane long before they reach novel length. Anthropic are doing a good job embarrassing themselves at an easy bug-catching game for barely-literate 8-year olds as we speak, and the model’s grip on reality is basically gone at this point, even with a second LLM trying to keep it on track. And even before they get to the ‘insanity’ stage, their writing inevitably experiences regression towards the average of all writing styles regardless of the prompt, so there’s not much ‘prompt engineering’ you can do to fix this.
Ancalagon 49 days ago [-]
This has not been my experience. Which models are you using? The AIs all seem to lose the plot eventually.
yfw 49 days ago [-]
The value of art is that it's a human creation and a product of human expression. The movie you generate from AI is at best content.
sinzin91 50 days ago [-]
You should check out Blender MCP, which allows you to connect Claude Desktop/Cursor/etc to Blender as a tool. Still early days from my experiments but shows where it could go https://github.com/ahujasid/blender-mcp
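Under the hood these bridges boil down to executing Blender's Python API (bpy) on your behalf. Here's a minimal sketch of the kind of script a tool call might run inside Blender; the object names, colors, and modifier settings are arbitrary examples for illustration, not blender-mcp's actual output.

```python
# Sketch of the sort of bpy script an LLM/MCP bridge might execute inside Blender.
# Must be run from within Blender (bpy is not available as a standalone package
# in most environments). All names and values below are arbitrary examples.
import bpy

# Add a cube and make it the active object, the way a tool call might.
bpy.ops.mesh.primitive_cube_add(size=2.0, location=(0.0, 0.0, 1.0))
cube = bpy.context.active_object
cube.name = "GeneratedCube"

# Create a simple red material via the default Principled BSDF node.
mat = bpy.data.materials.new(name="GeneratedMaterial")
mat.use_nodes = True
bsdf = mat.node_tree.nodes["Principled BSDF"]
bsdf.inputs["Base Color"].default_value = (0.8, 0.2, 0.2, 1.0)  # RGBA
cube.data.materials.append(mat)

# Add a subdivision surface modifier so the result isn't just a bare cube.
mod = cube.modifiers.new(name="Subdiv", type='SUBSURF')
mod.levels = 2
```

The interesting part of the MCP integrations is that the model writes and iterates on scripts like this for you based on what you ask for and what it sees in your scene.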
dr_kiszonka 48 days ago [-]
This looks great! Do you think you might add an option to use the model linked here instead of Hyper3D?
mclau156 50 days ago [-]
I have never seen knowledge be the limiting factor for success in the 3D world; it's usually the lots of dedicated time needed to model, rig, and animate.
iamjackg 50 days ago [-]
It's often the limiting factor to getting started, though. Idiosyncratic interfaces and control methods make it really tedious to start learning from scratch.
spookie 50 days ago [-]
I don't think they are idiosyncratic. They are built for purpose; one simply doesn't know what to look for.
Same for programming really.
I also think that using AI would only lengthen the learning period. It will get some kind of result faster, though.
spookie 50 days ago [-]
If you need time dedicated to it, knowledge is the limiting factor.
anonzzzies 50 days ago [-]
Have you tried sharing your screen with Gemini instead of screenshots? I found it's sometimes really brilliant and sometimes terrible. It's mostly a win, really.
fixprix 49 days ago [-]
I just tried it for the first time and it was a pretty cool experience. Will definitely be using this more. Thanks for the tip!
baq 50 days ago [-]
Look up blender and unity MCP videos. It’s working today.
fixprix 50 days ago [-]
Watching a video on it now, thanks!
tempaccount420 50 days ago [-]
> Unity/Blender/Photoshop/etc. are ripe for putting an LLM over the entire UI and exposing the APIs to it.
This is what Windows Copilot should have been!
fixprix 49 days ago [-]
I'm sure they're working on it. This MCP stuff is early days. Even I am just finding out about its integration into Blender and Unity in this thread.
tough 49 days ago [-]
There are already MCPs for Blender, Unity, and Figma.
sruc 50 days ago [-]
Nice model, but strange license. You are not allowed to use it in the EU, UK, and South Korea.
“Territory” shall mean the worldwide territory, excluding the territory of the European Union, United Kingdom and South Korea.
You agree not to use Tencent Hunyuan 3D 2.0 or Model Derivatives:
1. Outside the Territory;
johaugum 50 days ago [-]
Meta’s Llama models (and likely many others') have similar restrictions.
Since they don’t fully comply with EU AI regulations, Meta preemptively disallows their use in those regions to avoid legal complications:
“With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models”
https://github.com/meta-llama/llama-models/blob/main/models/...
This is merely a “we don't take responsibility if this somehow violates EU rules around AI”; it's not something they can enforce in any way.
But even as such a strategy, I don't think it would hold if the Commission decided to fine Tencent for the release, should it violate the regulation.
IMHO it's just the lawyers doing something to please the boss who asked them to “solve the problem” (which they can't, really).
ForTheKidz 50 days ago [-]
Probably for domestic protection more than face value. Western licenses certainly have similar clauses to protect against liability for sanction violations. It's not like they can actually do much to prevent the EU from gaining from it.
I am impressed; it runs very fast, far faster than the non-turbo version. But most of the time is being spent on texture generation rather than model generation, and as far as I can tell this speeds up the model generation, not the texture generation. Impressive nonetheless.
I also took a headshot of my kid, ran it through https://www.adobe.com/express/feature/ai/image/remove-backgr..., cropped the image, resized it to 1024x1024, and it spit out a 3D model of my kid with a texture. There are still some small artifacts, but I am impressed. It works very well with the assets/example_images. Very usable.
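(If you'd rather script the prep, the crop and resize are a couple of lines of Pillow. A minimal sketch below, with placeholder file names; background removal is left to whatever tool you prefer.)

```python
# Rough sketch: square-crop and resize a photo to 1024x1024 for an image-to-3D pipeline.
# "headshot.png" / "headshot_1024.png" are placeholder file names.
from PIL import Image

img = Image.open("headshot.png").convert("RGBA")

# Center-crop to a square so the resize doesn't distort the face.
side = min(img.size)
left = (img.width - side) // 2
top = (img.height - side) // 2
img = img.crop((left, top, left + side, top + side))

img = img.resize((1024, 1024), Image.LANCZOS)
img.save("headshot_1024.png")
```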
Good work Hunyuan!
Y_Y 50 days ago [-]
How are they extracting value here? Is this just space-race-4-turbo propagandising?
I see plenty of GitHub sites that are barely more than advertising, where some company tries to foss-wash their crapware, or tries to build a little text-colouring library that burrows into big projects as a sleeper dependency. But this isn't that.
What's the long game for these companies?
yowlingcat 50 days ago [-]
There's an old Joel Spolsky post that's evergreen about this strategy -- "commoditize your complement" [1]. I think it's done for the same reason Meta has made llama reasonably open -- making it open ensures that a proprietary monopoly over AI doesn't threaten your business model, which is noteworthy when your business model might include aggregating tons of UGC and monetizing engagement over it. True, you may not be able to run the only "walled garden" around it anymore, but at least someone else can't raid your walled garden to make a new one that you can't resell anymore. That's the simplest strategic rationale I could give for it, but I can imagine deeper layers going beyond that.
[1] https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
Maybe it was unclear, but that's what I'm asking too.
awongh 50 days ago [-]
What's the best img2mesh model out there right now, regardless of processing requirements?
Are any of them better or worse with mesh cleanliness? Thinking in terms of 3d printing....
MITSardine 50 days ago [-]
From what I could tell of the Git repo (2min skimming), their model is generating a point cloud, and they're then applying non-ML meshing methods on that (marching cubes) to generate a surface mesh. So you could plug any point-cloud-to-surface-mesh software in there.
I wondered initially how they managed to produce valid meshes robustly, but the answer is not to produce a mesh, which I think is wise!
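To illustrate that pluggable step (a generic stand-in, not the repo's actual code): scikit-image's marching cubes turns any sampled scalar field into a triangle mesh, which you could then hand off to whatever surface-processing tool you like. Here with a sphere's signed distance field as a toy volume:

```python
# Illustration only: extract a surface mesh from a sampled scalar field with
# marching cubes, the kind of non-ML meshing step described above.
import numpy as np
from skimage import measure
import trimesh

# Stand-in volume: signed distance of a sphere sampled on a 64^3 grid.
n = 64
coords = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5  # negative inside, positive outside

# Extract the zero level set as a triangle mesh.
verts, faces, normals, values = measure.marching_cubes(sdf, level=0.0)

mesh = trimesh.Trimesh(vertices=verts, faces=faces)
mesh.export("sphere.obj")
print(f"{len(verts)} vertices, {len(faces)} faces")
```

Swap the toy SDF for whatever field the generative model produces and the rest of the pipeline stays the same, which is why the non-mesh output is a sensible design choice.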
quitit 50 days ago [-]
Running my usual img2mesh tests on this.
1. It does a pretty good job, definitely a steady improvement
2. The demos are quite generous versus my own testing, however this type of cherry-picking isn't unusual.
3. The mesh is reasonably clean. There are still some areas of total mayhem, but these are easy to fix in clay modelling software.
leshokunin 50 days ago [-]
Can we see meshes or exports in common apps as examples?
This looks better than the other one on the front page rn
https://huggingface.co/spaces/tencent/Hunyuan3D-2
llm_nerd 50 days ago [-]
Generate some of your own meshes and drop them in Blender.
The meshes are very face-rich, and unfortunately do not reduce well in any current tool [1]. A skilled Blender user can quickly generate better meshes with a small fraction of the vertices. However if you don't care about that, or if you're just using it for brainstorming starter models it can be super useful.
[1] A massive improvement in the space will be AI or algorithmic tools which can decimate models better than the current crop. Often thousands of vertices can be reduced to a fraction with no appreciable impact in quality, but current tools can't do this.
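For reference, the current non-AI route is basically edge-collapse decimation; a quick hedged sketch of driving it from Blender's Python API follows (the 0.1 ratio is an arbitrary example, and this is exactly the kind of reduction that tends to hurt these dense generated meshes).

```python
# Sketch: decimate the active object in Blender with the built-in modifier.
# Run inside Blender with the imported mesh selected. The ratio is an example;
# aggressive values illustrate the quality loss described above.
import bpy

obj = bpy.context.active_object        # assumes the imported mesh is active
mod = obj.modifiers.new(name="Decimate", type='DECIMATE')
mod.decimate_type = 'COLLAPSE'         # edge-collapse decimation
mod.ratio = 0.1                        # keep roughly 10% of the faces

bpy.ops.object.modifier_apply(modifier=mod.name)
print(f"Faces after decimation: {len(obj.data.polygons)}")
```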
kilpikaarna 48 days ago [-]
There's Quad Remesher (integrated in C4D and ZBrush as ZRemesher). It's proprietary but quite affordable ($109 for a perpetual commercial license or $15/month -- no, not affiliated).
No AI, just clever algorithms. I'm sure there are people trying to train a model to do the same thing but jankier and more unpredictable, though.
llm_nerd 47 days ago [-]
It's an interesting project and seems to work superbly on human-created topologies, but in some testing with outputs of Hunyuan3D v2, it is definitely a miss. It is massively destructive to the model and seems to miss extremely obvious optimizations of the mesh while destroying the fidelity of the model even at very low reduction settings.
Something about the way this project generates models does not mesh, har har, with the algorithms of Quad Remesher.
dvrp 50 days ago [-]
Agree. That's why I posted it; I was surprised people were sleeping on this. But it's because they posted something yesterday and so the link dedup logic ignored this. This is why I linked to the commit instead.
There are mesh examples on the GitHub. I'll toy around with it.
I think the link should be updated to this since it's currently just pointing to a git commit.
dvrp 50 days ago [-]
The reason for that is that the dedup filter thought this release was the same as the one that happened yesterday. Besides, the Flash release is only one of many.
50 days ago [-]
boppo1 50 days ago [-]
Can it run on a 4080 but slower, or is the vram a limitation?
llm_nerd 50 days ago [-]
It can run on a 4080 if you divide and conquer. I just ran a set on my 3060 (12GB), although I have my own script which does each step separately, as each stage uses from 6 to 12 GB of VRAM. My script:
-loads the diffusion model to go from text to an image, then generates a varied series of images based on my text. One of the most powerful features of this tool, in my opinion, is text to mesh; to do this it uses a variant of Stable Diffusion to create 2D images as a starting point, then returns to the image-to-mesh pipeline. If you already have an image this part obviously isn't necessary.
-frees the diffusion model from memory.
Then for each image I-
-load the image to mesh model, which takes approximately 12GB of VRAM. Generate a mesh
-free the image to mesh model
-load the mesh + image to textured mesh model. Texture the mesh
-free the mesh + image to textured mesh model
It adds a lot of I/O between each stage, but with super fast SSDs it just isn't a big problem.
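Roughly, that staging pattern looks like the sketch below. The pipeline class names are my reading of the Hunyuan3D-2 README and may need adjusting against the repo; the gc/torch calls for freeing VRAM between stages are standard PyTorch practice, and that part is the point of the sketch.

```python
# Rough sketch of the divide-and-conquer staging described above. Pipeline class
# names follow my reading of the Hunyuan3D-2 README and may need adjusting;
# the del/gc/empty_cache pattern for returning VRAM between stages is the key bit.
import gc
import torch
from hy3dgen.shapegen import Hunyuan3DDiTFlowMatchingPipeline
from hy3dgen.texgen import Hunyuan3DPaintPipeline

images = ["gen_0.png", "gen_1.png"]  # outputs of the text-to-image stage

for path in images:
    # Stage: image -> bare mesh (the ~12 GB step).
    shape_pipe = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained("tencent/Hunyuan3D-2")
    mesh = shape_pipe(image=path)[0]
    del shape_pipe
    gc.collect()
    torch.cuda.empty_cache()  # return the shape model's VRAM before texturing

    # Stage: mesh + image -> textured mesh.
    paint_pipe = Hunyuan3DPaintPipeline.from_pretrained("tencent/Hunyuan3D-2")
    textured = paint_pipe(mesh, image=path)
    textured.export(path + ".glb")  # trimesh-style export, as in the repo examples
    del paint_pipe
    gc.collect()
    torch.cuda.empty_cache()
```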
llm_nerd 50 days ago [-]
Just as one humorous aside: if you use the text-to-mesh pipeline, as mentioned, the first stage is simply a call to a presumably fine-tuned variant of Stable Diffusion with your text and the following prompts (translated from Simplified Chinese):
Positive: "White background, 3D style, best quality"
Negative: "text, closeup, cropped, out of frame, worst quality, low quality, JPEG artifacts, PGLY, duplicate, morbid, mutilated, extra fingers, mutated hands, bad hands, bad face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"
Thought that was funny.
dvrp 50 days ago [-]
They don't mention that and I don't have one — can you try for yourself and let us know? I think you can get it from Huggingface or GH @ https://github.com/Tencent/Hunyuan3D-2
fancyfredbot 50 days ago [-]
They mention "It takes 6 GB VRAM for shape generation and 24.5 GB for shape and texture generation in total."
So based on this your 4080 can do shape but not texture generation.
boppo1 50 days ago [-]
Nice, that's all I needed anyway.
thot_experiment 50 days ago [-]
Almost certainly. I haven't tried the most recent models, but I have used hy3d2 and hy3d2-fast a lot and they're quite light to inference. You're gonna spend more time decoding the latent than you will inferencing. Takes about 6 GB of VRAM on my machine; I can't imagine these will be heavier.
lwansbrough 50 days ago [-]
How long before we start getting these rigged using AI too? I’ve seen a few of these 3D models so far but none that do rigging.
This is what I'm looking forward to the most, there's a lot of potential for virtual reality with these models.
debbiedowner 50 days ago [-]
Has anyone tried it on a 3090?
coolius 50 days ago [-]
has anyone tried to run this on apple silicon yet?
postalrat 50 days ago [-]
That would be revolutionary.
amelius 50 days ago [-]
I don't understand why it is necessary to make it this fast.
Philpax 50 days ago [-]
It helps with iteration - you can try out different concepts and variations quickly without having to wait, especially as you refine what you want and your understanding of what it's capable of.
Also, in general, why not?
amelius 50 days ago [-]
> Also, in general, why not?
There are various reasons:
- Premature optimization will take away flexibility, and will thus affect your ability to change the code later.
- If you add features later that will affect performance, then since the users are used to the high performance, they might think your code is slow.
- There are always a thousand things to work on, so why spend effort on things that users, at this point, don't care much about?
TeMPOraL 50 days ago [-]
Being this fast is not a "premature optimization"; it's a qualitatively different product category. ~immediate feedback vs. long wait time enables entirely different kinds of working.
Also:
> since the users are used to the high performance, they might think your code is slow.
I wouldn't worry about it in general - almost all software is ridiculously slow for the little it can do, and for the performance of machines it runs on, and it still gets used. Users have little choice anyway.
In this specific case, if speed makes it into a different product, then losing that speed makes the new thing... a different product.
> There are always a thousand things to work on, so why spend effort on things that users, at this point, don't care much about?
It's R&D work, and it's not like they're selling it. Optimizing for speed and low resource usage is actually a good way to stop the big players from building moats around the technology, and to me, that seems like a big win for humanity.
llm_nerd 50 days ago [-]
They released the original "slow" version several months ago. After understanding the problem space better they can now release the much, much faster variant. That is the complete opposite of premature optimization.
Yes, of course people care about performance. Generating the mesh on a 3060 took 110+ seconds before, and now is about 1 second. And on early tests the quality is largely the same. I'd rather wait 1 second than 110 seconds, wouldn't you? And obviously this has an enormous impact on the financials of operating this as a service.
andybak 50 days ago [-]
> users, at this point, don't care much about?
What makes you think this is true?
bufferoverflow 50 days ago [-]
Fast is always better than slow, if the quality isn't worse.