[RFC] Network render users, assemble!

Heavily Tessellated
Posts: 108
Joined: Thu Aug 10, 2006 4:20 pm
Location: Huh?

Post by Heavily Tessellated » Tue Sep 11, 2007 5:05 pm

Beating up the networking code (for fun!)
---

In a nutshell, what we have is: The master giveth, the master taketh back.

I don't want to rehash all the threads on the subject, but instead, think about this:
What would be the least amount of effort for nik and the most beneficial for everyone?

I was pondering, would the ability to let slaves run an external command at certain specific times be useful? Like when initially launching as a slave, and again when it's ready to dump its buffer to the master? With the latter, we'd probably want some code to be able to announce "incoming!" to the master as well.
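
To make the hook idea concrete, here is a rough Python sketch of how a slave wrapper might fire external commands at named events. None of this is Indigo's actual interface: the event names `slave_start` and `buffer_ready`, the `HOOKS` table, and the `INDIGO_EVENT` environment variable are all made up for illustration.

```python
import os
import subprocess
import sys

# Hypothetical hook table: event name -> external command (argv list).
# The event names and commands are illustrative assumptions, not Indigo options.
HOOKS = {
    "slave_start": [sys.executable, "-c", "print('sanity check ok')"],
    "buffer_ready": [
        sys.executable,
        "-c",
        "import os; print(os.environ['INDIGO_EVENT'])",
    ],
}


def run_hook(event, hooks=HOOKS, timeout=30):
    """Run the external command registered for `event`, if any.

    Returns (exit_code, stdout), or None when no hook is registered.
    """
    command = hooks.get(event)
    if command is None:
        return None
    # Tell the script which event fired via a (hypothetical) environment variable.
    env = dict(os.environ, INDIGO_EVENT=event)
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print(run_hook("slave_start"))
    print(run_hook("buffer_ready"))
```

A non-zero exit code from the `slave_start` hook could be the signal not to join the render at all, which is about all a sanity checker needs.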

Not much coding on his part, but that way, each of us can run down the wire with our own little proggies and scripts... a simple sanity checker at the slave start, so we don't end up with double-exposed images like that one CRC got. A handy texture image fetch and perhaps even an XML parser to fix paths when some slaves are Windows and some Linux/BSD/OSX+wine, or when machines aren't on the same wire and can't use shared storage. Query available memory on the server and decide to spawn or not. And, zlib or otherwise, we could smush IGIs however we see fit, check network congestion, whatever. All externalized from Indigo.
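
As one example of what such an external script could do, here is a small Python sketch of the path-fixing step for a mixed Windows/Linux farm. It only handles the path string itself; finding the texture paths inside a scene file is left out, since that depends on the scene format. The function name and the example roots are assumptions for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_posix(path, windows_root, posix_root):
    """Rewrite a Windows texture path so it points under a POSIX root.

    Paths that are not under `windows_root` are returned unchanged.
    """
    win_path = PureWindowsPath(path)
    try:
        relative = win_path.relative_to(PureWindowsPath(windows_root))
    except ValueError:
        return path  # not under the mapped root; leave it alone
    return str(PurePosixPath(posix_root).joinpath(*relative.parts))


if __name__ == "__main__":
    print(
        windows_to_posix(
            r"C:\renders\textures\wood\oak.jpg",
            r"C:\renders\textures",
            "/mnt/renders/textures",
        )
    )
```

The pure path classes never touch the filesystem, so the same script runs on whichever OS the slave happens to be.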

I'm not talking about an 18-weeks-of-planning-and-zero-writing-code full network API, just a quick & dirty method to gain a wee bit of granularity. Call it a kludge, it would still benefit all types of network rendering, from full render queue management projects to just me tunnelling :7777 via SSH to my one remote master when I log out for the night.
-------

I'm not trying (that hard) to sell my idea... some of the others (zsouthboy's) are better. Besides, neither he nor I can make nik do a damn thing, much less write and test code changes. I'm just trying to get a forum thinktank rolling with one basic idea - without seeing the source, what do you think is the best way we can get some control over network renders, with the least amount of effort and minimal impact on Indigo code?

Well, anyway, thanks for reading.

zsouthboy
Posts: 1395
Joined: Fri Oct 13, 2006 5:12 am

Post by zsouthboy » Wed Sep 12, 2007 9:44 am

I'm with you on this one.

I'm in the middle of writing some Python to handle Indigo across my machines - and it's not going well.
(I'm learning Python as I go - neat language, nice and strong like C/C++)

My problem right now is terminating the slave processes. Bleh. So ugly.
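
For reference, a minimal sketch of the usual "ask nicely, then force" pattern with Python's `subprocess` module. The sleeping child process is only a stand-in for a render slave, and `stop_process` is a made-up helper name. Note that this does not let the slave finish uploading anything; it only avoids leaving orphaned processes behind.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Ask a child process to exit, then force-kill it if it ignores us.

    Returns the process's exit code.
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; same as terminate() on Windows
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a render slave: a child that just sleeps for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit code:", stop_process(child))
```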

rsz
Posts: 1
Joined: Wed Sep 12, 2007 11:23 am

Post by rsz » Wed Sep 12, 2007 11:42 am

With other renderers, my approach to network rendering is a slave/master MPI program that farms out frame-parallel rendering via Sun Grid Engine (SGE) on large (500-1000 core) NUMA and SMP systems. I only wish I could build Indigo from source for my obscure 64-bit big-iron platforms, or as a first cut run a Linux-compiled 32-bit version on them.

If I could make some donation to push for a Linux or cross-compiled Indigo, let me know. Cycles waste as we speak! Sorry to digress. Later,

Remik

oogsnoepje
Posts: 35
Joined: Sat Aug 18, 2007 10:20 am

Post by oogsnoepje » Wed Sep 12, 2007 4:43 pm

zsouthboy wrote: My problem right now is terminating the slave processes. Bleh. So ugly.
I'm working on a GUI for Indigo (and other renderers) and have to agree with this.

It's also a bit of a mundane task to shut down Indigo. I'd like to make my interface as user-friendly as possible, but for that I need Indigo to be able to shut down immediately when the user asks it to, because otherwise the user has to wait until Indigo has closed down before starting a new render job.

That's why I currently just shoot Indigo out of the sky by killing it. Not nice, but it feels like a hack to add timers for Indigo to shut down or to use multiple instances.
(well, with a tasty use of named pipes, background processes, scraping and some regex-magic to scrape the output it's already kind of a hack, but still... :roll: )

Ono, would you happen to have any plans to do an API so we program writers can support Indigo the right way?

Kram1032
Posts: 6649
Joined: Tue Jan 23, 2007 3:55 am
Location: Austria near Vienna

Post by Kram1032 » Wed Sep 12, 2007 11:36 pm

Uhm... I don't want Indigo to shut down from one second to the next; I want it to save the last state of the picture first, and then shut down right away :) - IGIs take quite a long time to save, I think... and hi-res images, too ;)

zsouthboy
Posts: 1395
Joined: Fri Oct 13, 2006 5:12 am

Post by zsouthboy » Thu Sep 13, 2007 1:13 am

Kram1032 wrote: Uhm... I don't want Indigo to shut down from one second to the next; I want it to save the last state of the picture first, and then shut down right away :) - IGIs take quite a long time to save, I think... and hi-res images, too ;)
For network renderings this isn't an issue. Kill a slave in the middle of it uploading a frame -> master process goes "Slave connection dropped", no harm no foul

Kram1032
Posts: 6649
Joined: Tue Jan 23, 2007 3:55 am
Location: Austria near Vienna

Post by Kram1032 » Thu Sep 13, 2007 1:30 am

ah, ok, if it's "only" slave-killing :)

Heavily Tessellated
Posts: 108
Joined: Thu Aug 10, 2006 4:20 pm
Location: Huh?

Post by Heavily Tessellated » Tue Sep 18, 2007 5:39 am

Hmm. Now with the commercial SDK I'm wondering what we can hope for...

Just because people can afford it doesn't mean they'd buy it if there was an easy way not to. I can't begin to tell you how many architectural sites I've been to where ACAD license disks floated around like candy... and that's straight-up piracy.

Still, c'mon! Assume you had a couple quad Xeons as footrests under your desk, and you want a bit more renderslave control... what would you wish for?

OnoSendai
Developer
Posts: 6244
Joined: Sat May 20, 2006 6:16 pm
Location: Wellington, NZ

Post by OnoSendai » Wed Sep 19, 2007 12:16 am

Heavily Tessellated:
Hmm, what exactly are you asking for?

I do think that the network rendering process could be made a lot easier.
I have actually coded a network rendering manager in Java; Java kind of sucks in some ways tho for this usage, for example you need the JRE installed.

zsouthboy
Posts: 1395
Joined: Fri Oct 13, 2006 5:12 am

Post by zsouthboy » Wed Sep 19, 2007 1:31 am

Ono:

I'd be happy with a simple secondary listening port opened from each slave, that accepts some well-defined packet (that you specify), to finish sending the last frame, and then terminate.
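
Roughly what I mean, as a Python sketch rather than anything Indigo does today. The `FINISH` packet, the `OK` reply and both function names are invented here; the real packet format would be up to you. The render loop would check `finish_event` after each frame upload and exit once it is set.

```python
import socket
import threading

# Hypothetical control packet and reply; the real format would be the developer's call.
FINISH_COMMAND = b"FINISH\n"
FINISH_REPLY = b"OK\n"


def start_control_listener(finish_event, host="127.0.0.1", port=0):
    """Listen on a secondary port and set `finish_event` when FINISH arrives.

    Returns the port actually bound (port=0 lets the OS pick a free one).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    bound_port = server.getsockname()[1]

    def serve():
        with server:
            while not finish_event.is_set():
                conn, _addr = server.accept()
                with conn:
                    # A real protocol would loop until the full packet arrives.
                    if conn.recv(len(FINISH_COMMAND)) == FINISH_COMMAND:
                        finish_event.set()
                        conn.sendall(FINISH_REPLY)

    threading.Thread(target=serve, daemon=True).start()
    return bound_port


def request_finish(host, port, timeout=5.0):
    """Client side: ask a slave to finish its current upload and exit."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(FINISH_COMMAND)
        return conn.recv(16) == FINISH_REPLY


if __name__ == "__main__":
    finish_event = threading.Event()
    port = start_control_listener(finish_event)
    print("acknowledged:", request_finish("127.0.0.1", port))
    print("slave should stop after this frame:", finish_event.is_set())
```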

oogsnoepje
Posts: 35
Joined: Sat Aug 18, 2007 10:20 am

Post by oogsnoepje » Wed Sep 19, 2007 3:41 pm

I still think Indigo should just stop right away when the user requests so.

I mean, you see the current resulting image, and it's that image you want. Why would Indigo have to continue rendering when you've already got what you'd like to have?

That's my point of view at least.

Kram1032
Posts: 6649
Joined: Tue Jan 23, 2007 3:55 am
Location: Austria near Vienna

Post by Kram1032 » Thu Sep 20, 2007 2:09 am

'Cause you might have set Indigo to a very low update rate (to make it faster)...
so the output image might already be far more noise-free ^^

I just had an idea... I wonder if it would work...
"Analyse" the picture for its amount of noise, and put a scale on that
(something like a percentage, maybe) - it defines what percentage of the pic is still noisy. The user could define a value below which he/she thinks it's noise-free enough. There, the render will stop. :)
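
A very crude Python sketch of that idea, just to show the shape of it. The noise measure is an assumption picked for simplicity (a pixel counts as "noisy" if it differs from the mean of its four neighbours by more than a tolerance), so it will also flag real edges and fine texture; a proper estimator would need to be smarter. The image is a plain list of rows of grey values in [0, 1].

```python
def noisy_fraction(image, tolerance=0.1):
    """Fraction of interior pixels that differ from their 4-neighbour mean
    by more than `tolerance`. `image` is a list of rows of floats in [0, 1].
    """
    height, width = len(image), len(image[0])
    noisy = total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            neighbour_mean = (
                image[y - 1][x]
                + image[y + 1][x]
                + image[y][x - 1]
                + image[y][x + 1]
            ) / 4.0
            total += 1
            if abs(image[y][x] - neighbour_mean) > tolerance:
                noisy += 1
    return noisy / total if total else 0.0


def should_stop(image, max_noisy_percent, tolerance=0.1):
    """True once the noisy share of the picture drops to the user's limit."""
    return noisy_fraction(image, tolerance) * 100.0 <= max_noisy_percent


if __name__ == "__main__":
    flat = [[0.5] * 5 for _ in range(5)]
    print(should_stop(flat, max_noisy_percent=1.0))
```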

Heavily Tessellated
Posts: 108
Joined: Thu Aug 10, 2006 4:20 pm
Location: Huh?

Post by Heavily Tessellated » Mon Sep 24, 2007 9:41 am

OnoSendai wrote: Heavily Tessellated:
Hmm, what exactly are you asking for?
Hmm indeed... I'm not entirely sure what I want, that's the point of trying to get a discussion going.

I would really like to see a render queue and some basic queue management; whether this is directly in Indigo, or Indigo modified in such a way that it can work with things like DrQueue, is not important (although making it integrate OOTB with open source massive renderfarm management software might be contrary to your plans for commercial development). Re: JRE, I think anyone that's capable of using Indigo is capable of downloading and installing Java, so if you think your Java dealie is ready for public beta, put it up! :D Most of us immediately install the actual Sun Java JRE/JDK over gcj or Microsoft's Java VM when building systems; it's just better.

I/we could really benefit from a smaller memory footprint on the master, so perhaps one box could be a dedicated master for multiple renders. As it stands, even feeding a handful of slaves to a single render sends the memory usage into swap space very fast. I know someone else was saying how hard it was getting 25! slaves to run, I about fell out of my chair... he must have 16G RAM.

A way for slaves to "check in" would be super nice - to be able to stop a slave and move it to a higher priority render, for instance. While simple enough to implement basic heartbeat/spawn/kill in just a shell script, it's nasty. And without feedback from the client, it would all be blind and brute force. I wouldn't know how to do this in Windows without requiring external apps, perhaps Sysinternals stuff like pslist/psexec/pskill... yet I feel the batch file would end up being so involved it might be better just to code it.
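
The bookkeeping side of a check-in is small. Here is a Python sketch of a master-side registry that remembers when each slave last reported in and which job it was on; the class and method names are invented, and how the heartbeat actually travels over the wire is left open.

```python
import time


class SlaveRegistry:
    """Master-side record of when each slave last checked in."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_seen = {}  # slave name -> (timestamp, current job)

    def check_in(self, slave, job=None):
        """Record a heartbeat from `slave`, optionally noting its current job."""
        self._last_seen[slave] = (self._clock(), job)

    def alive(self):
        """Names of slaves heard from within the timeout, sorted."""
        now = self._clock()
        return sorted(
            name
            for name, (seen, _job) in self._last_seen.items()
            if now - seen <= self.timeout_seconds
        )

    def stale(self):
        """Names of slaves that have gone quiet, sorted."""
        return sorted(set(self._last_seen) - set(self.alive()))


if __name__ == "__main__":
    registry = SlaveRegistry(timeout_seconds=30.0)
    registry.check_in("box1", job="scene_a")
    print(registry.alive(), registry.stale())
```

With something like this on the master, moving a slave to a higher priority render becomes a decision made on real feedback instead of blind spawn/kill.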

'Course all this depends greatly on what you think falls under the scheme of a render master's job or an external queue manager, be it your java one or something else.

I'm sorry, it's disorganized, it's just a typed-out thought stream. I'm sure everyone has a thought or two on the subject. I'd really like some more people to voice what THEY would like to see...

dougal2
Developer
Posts: 2532
Joined: Wed Nov 15, 2006 8:17 am
Location: South London

Post by dougal2 » Mon Sep 24, 2007 10:51 am

I would like to see a system whereby a slave is started with no assumptions about what scene it is going to render. It finds the master and checks itself in for duty.

It would then be the job of the master to dish it a scene file to render (slave could either keep this in memory, store to a temp file, or perhaps the master simply gives it the path of a file on a shared folder somewhere), and keep the relationship going until either the render is complete, or stopped, or whatever.

Perhaps then the master would be able to have some sort of job queue management built on top of that - with spp or elapsed time stop conditions.

Perhaps the master could put slaves into different groups to render several jobs at once (ie, split your farm in half to render 2x as many images at 1/2 the spp).
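
Sketched in Python, the scheme above might look something like this. It is only an outline under my own assumptions: the `Job` and `Master` names, the fixed group size and the scene file names are invented, and nothing here talks to a real slave.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    scene_path: str
    max_spp: float = float("inf")      # stop after this many samples per pixel
    max_seconds: float = float("inf")  # ...or after this much elapsed time
    slaves: list = field(default_factory=list)

    def finished(self, spp, elapsed_seconds):
        """True once either stop condition has been reached."""
        return spp >= self.max_spp or elapsed_seconds >= self.max_seconds


class Master:
    def __init__(self):
        self.queue = deque()  # jobs waiting for slaves
        self.idle = []        # checked-in slaves with nothing to do

    def submit(self, job):
        self.queue.append(job)

    def check_in(self, slave):
        """A slave reports for duty with no scene of its own."""
        self.idle.append(slave)

    def dispatch(self, slaves_per_job):
        """Hand groups of idle slaves to queued jobs; return the started jobs."""
        started = []
        while self.queue and len(self.idle) >= slaves_per_job:
            job = self.queue.popleft()
            job.slaves = self.idle[:slaves_per_job]
            self.idle = self.idle[slaves_per_job:]
            started.append(job)
        return started


if __name__ == "__main__":
    master = Master()
    for name in ["s1", "s2", "s3", "s4"]:
        master.check_in(name)
    master.submit(Job("scene_a.xml", max_spp=500))
    master.submit(Job("scene_b.xml", max_seconds=3600))
    for job in master.dispatch(slaves_per_job=2):
        print(job.scene_path, job.slaves)
```

Splitting four slaves into two groups of two is the "render 2x as many images at 1/2 the spp" case; a smarter master could size groups by job priority instead.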

Just a few ideas, sorry if they've been mentioned already - I haven't read the whole thread.

Caronte
Posts: 61
Joined: Tue May 01, 2007 7:17 am
Location: Valencia, Spain

Post by Caronte » Tue Sep 25, 2007 11:10 am

These days our computers become obsolete very quickly, and it's easy to have more than one at home. I think we need a good and easy method to set up a small renderfarm for Indigo without problems.

I'm a Blender user, but I have a CarraraPro license and I love its netrenderer, because the only thing I need to do to use all my computers is push the render button. I can also stop it by pressing ESC.
No matter how many times I launch or quit renders, I don't have to worry about the other computers any more.
Sorry about my poor English ;)
