Any interest in a cluster tutorial?



Airbozo
03-21-2007, 06:07 PM
I have been tasked here at work to create several different cluster systems and was also asked to create a "worklog" of sorts. My question is: would there be enough interest here for me to post my "worklog", complete with screen captures and side info?

These "Clusters" will be built on the same systems, with different disks. I am planning on creating a "Web Server Cluster", a "Database Cluster" and a "Visual Cluster" for this project. These "Clusters" will consist of 2 or 3 nodes depending on the type of cluster: a master and 1 or 2 slaves. (The visual cluster will have 1 master and 2 slaves for sure.)

Any interest? I know this is a bit above the "Modding" theme here so that is why I ask, although someone could take the cluster approach to a new modding level.

I would be using 1U rackmount servers and software from here (which already has a good tutorial): http://www.rocksclusters.org

public_eyesore
03-21-2007, 06:55 PM
Yes!!!

edit: I got to thinking, yes is a really weird word. Almost like it's not English.

nil8
03-21-2007, 07:04 PM
Oh yea. Clusters are one area where I know nothing and I think it would be great to see how they're built and work.

SgtM
03-21-2007, 07:15 PM
Oh yea. Clusters are one area where I know nothing and I think it would be great to see how they're built and work.

Same here. I understand the concept, just never had to deploy it. Besides, it might help on my MCSE test.

Bucko
03-21-2007, 08:14 PM
I'd say yes too. I know the term, but not how it is applied, so a good worklog would expand my knowledge.

Go for it!

.Maleficus.
03-21-2007, 08:23 PM
I'd read it for sure. And hey, it may be above the modding thing, but if anyone used it and had 3 computers, that's 3 more projects, right? :D

Commando
03-21-2007, 11:20 PM
Sounds a little more interesting than the "do you like coffee?" thread.

Go for it. I think posts with some actual content are always worth it.

rendermandan
03-22-2007, 12:22 AM
Sorry for being a noob here, but could you define what a cluster is and what it is used for? thanks.

Oh and congrats on the recent promotion to moderator!

Drum Thumper
03-22-2007, 03:36 AM
Like everyone else, I understand the concept, but haven't had the chance to implement it. So please, a tutorial would be greatly appreciated!

progbuddy
03-22-2007, 07:03 AM
On that site you can register your rock cluster :p.

DaveW
03-22-2007, 07:44 AM
Sounds good! I'm avoiding Cluster computing because I know who the lecturer is, and apparently I'll need a good knowledge of C to get through it. I hate C.

-Dave

Drew
03-22-2007, 08:34 AM
Bring it on.

What's a cluster? (I thought it was a breakfast cereal..)

DaveW
03-22-2007, 08:54 AM
It's where you have a lot of computers which share processing power. It's more complicated than that though: a task manager is needed that decides which CPU should compute which tasks.

Folding@home is a kind of cluster computing, and so is the SETI thing. Although they're pretty advanced technologically, they're fairly primitive systems compared to real grid/cluster computing systems.

-Dave

Drew
03-22-2007, 09:16 AM
So I could 'cluster' two 350 MHz PCs and get the effect of a 700 MHz processor, in theory?

chaksq
03-22-2007, 10:18 AM
I would love to hear a project log. I was wondering how I could do that.

This can be done on regular machines too, right, not just rackmount? I have way too many useless machines that I could play around with.

nil8
03-22-2007, 10:28 AM
It's where you have a lot of computers which share processing power. It's more complicated than that though: a task manager is needed that decides which CPU should compute which tasks.

Folding@home is a kind of cluster computing, and so is the SETI thing. Although they're pretty advanced technologically, they're fairly primitive systems compared to real grid/cluster computing systems.

-Dave

Which is why a modified Unix engine is the basis for most cluster systems. The way that the Unix kernel works suits cluster work fairly well. Correct me if I'm wrong on this. Like I said before, I'm a noob to clusters.

Distributed computing is somewhat of a world wide cluster, but doesn't give the same effect as seeing a rack full of servers and knowing that they're all linked together, performing a common goal.
Besides that, it's a basic client/server program. It's just keeping their main servers open to process data and handle a large db of information, instead of chewing through all that work themselves.

Drew, with the overhead necessary to deal with the communication, handing out processing orders, etc., it would be closer to 650 or 600 MHz. Once again, someone put me in my place if I'm wrong.
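
To put rough numbers on that guess, here is a tiny Python sketch. It is only a toy model: the overhead fractions (7% and 14%) are assumptions picked to land near the 650 and 600 MHz figures above, not measurements, and the result describes combined capacity across the nodes, not the speed of any single program.

```python
def effective_mhz(node_mhz, nodes, overhead_fraction):
    """Combined clock rate left over after coordination overhead (toy model)."""
    return node_mhz * nodes * (1.0 - overhead_fraction)

# Assumed overhead fractions, chosen only for illustration.
for overhead in (0.07, 0.14):
    mhz = effective_mhz(350, 2, overhead)
    print(f"{overhead:.0%} overhead -> about {mhz:.0f} MHz combined")
```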

Crimson Sky
03-22-2007, 11:19 AM
That would be a very cool tutorial. I have a few miniITX boards I'd like to cluster.

Airbozo
03-22-2007, 11:55 AM
Ok then. I will put together a quick outline for myself and then start a new log. Keep in mind that with the ROCKS Cluster software it seems kind of magical how it works. UCSD has done extensive work on this project so sometimes it seems just too easy. I will start with the web server cluster and then move to a 3 node viz server cluster.

Clusters mean different things to different people. In a webserver cluster you can have a couple of different takes on the "cluster" approach. In one scenario, called an HA (High Availability) cluster, there is one system handling all the work. If that system fails, the backup node automatically takes over any and all work from the main node. In a compute cluster there is one Master that doles out the jobs to the different nodes based on a predetermined CPU/resource load. What this really means is that when you configure the cluster, you determine at what point you want other nodes to share in the workload, i.e., when node 1 hits 60% of CPU load, the next task is automatically started on node 2. You can even assign _which_ jobs get started on which nodes regardless of the load on any node (damn that was almost poetic!). This technique is called "Load Balancing" and is used on pretty much every large mainframe system out there. I will also have some side notes on each step and why I choose certain options.
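
As a rough illustration of the threshold idea described above, here is a minimal Python sketch. Only the 60% figure comes from the post; the node names, the task costs and the `pick_node` helper are hypothetical, and this is not how the ROCKS software itself schedules jobs.

```python
THRESHOLD = 0.60  # the 60% figure from the post; everything else is made up

def pick_node(loads, threshold=THRESHOLD):
    """Return the first node whose load is below the threshold.

    If every node is at or above the threshold, fall back to the
    least loaded node.
    """
    for name, load in loads.items():
        if load < threshold:
            return name
    return min(loads, key=loads.get)

loads = {"node1": 0.0, "node2": 0.0}   # hypothetical two-node cluster
placements = []
for task_cost in [0.25, 0.25, 0.25, 0.25, 0.25]:  # each task adds 25% load
    node = pick_node(loads)
    loads[node] += task_cost
    placements.append(node)

print(placements)
```

The first three tasks land on node1; once it passes 60%, the remaining tasks go to node2.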

EDIT: For a viz server cluster, the master makes sure that all the slave nodes push the visual data at the same time. This ensures that the images on the different nodes stay in sync. You could build a large video wall with many nodes, each with 2 displays. You then use the master to play a DVD and all nodes output their portion of the display at the same time, providing a LARGE display that appears to be one system.
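
The "everyone pushes at the same time" idea can be sketched in Python with a barrier. This is only an analogy under loud assumptions: threads on one machine stand in for separate nodes, `threading.Barrier` stands in for the master's coordination, and the node names and render delays are invented. A real viz cluster does this over the network.

```python
import threading
import time

NODES = ["slave1", "slave2"]             # hypothetical node names
barrier = threading.Barrier(len(NODES))  # stands in for the master's "go" signal
events = []
events_lock = threading.Lock()

def log(kind, node):
    with events_lock:
        events.append((kind, node))

def run_node(node, render_seconds):
    time.sleep(render_seconds)  # pretend to render this node's part of the frame
    log("rendered", node)
    barrier.wait()              # nobody continues until every node is ready
    log("shown", node)

threads = [
    threading.Thread(target=run_node, args=(node, delay))
    for node, delay in zip(NODES, (0.01, 0.05))  # one node is slower on purpose
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(events)
```

Even though one node finishes rendering early, no "shown" event is logged until both nodes have logged "rendered".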


Crimson, I was thinking last night about a miniITX cluster mod! (The wife just gave me this dorky stare as I mumbled it during the hockey game last night.) Hehe, I just may ask one of our vendors to sponsor this...

Crimson Sky
03-22-2007, 12:08 PM
I've seen a few miniITX clusters; this one (http://www.mini-itx.com/projects/cluster/) comes to mind first. 12 x 800 MHz nodes :)

Drew
03-22-2007, 01:25 PM
If this tutorial is n00b friendly, this is gonna rock so hard.

Now we know what to do with all the old crap we all got laying around.

:edit: Do all the Mobos need to be the same?

Airbozo
03-22-2007, 01:29 PM
If this tutorial is n00b friendly, this is gonna rock so hard.

Now we know what to do with all the old crap we all got laying around.

:edit: Do all the Mobos need to be the same?

I will make it as n00b friendly as possible without getting too heavy into details that don't matter (but will cover the details in case someone is interested).

While it is recommended that all the nodes be identical, it is not necessary. For a visual cluster it is more important that the video cards and underlying hardware are similar, or there will be noticeable delays and out-of-sync displays.

Luke122
03-22-2007, 01:34 PM
I'd love to see this also, as I have a load of old PC's here, just waiting to go the recycler. If I can make a use for them, I'd happily put them to work. :)

Canadian Eh?
03-23-2007, 02:28 PM
Plz start this cuz it would be a very cool worklog and i am bored of all the other ones. (no offense guys!)

Airbozo
03-23-2007, 02:54 PM
I am putting together an outline right now so I stay on track and don't drift too far away. I intend to do a two-parter: one for installation and one for configuration/operation.

I will post in another thread and start with the hardware inventory and software needed. I will then delay a day or two to allow people to gather the HW and SW in case they want to follow along. I was going to use a couple of rackmount SuperMicros, but instead I will use a couple of older mobos, then for the VIZ cluster I will use the SuperMicros with similar graphics cards (my older mobos are AGP and my good video cards, NVIDIA Quadro 1400s, are PCI Express).

Drew
03-23-2007, 02:58 PM
:banana:

Learning to come

:banana:

:edit: oops, that reads bad.... learning is coming?

Nope.

The opportunity to learn is upon us.

Better.

Drew
03-23-2007, 04:10 PM
So there is a performance gain then?

:edit: Perhaps I should just wait for the tutorial....

Make sure you cover the practical advantages dude.

Airbozo
03-23-2007, 04:23 PM
........ you are one informative SOB bro.

-Jeremy

Well half right anyway...

Drew
03-23-2007, 04:42 PM
Yeah, that'll be it.....

Airbozo
03-23-2007, 04:46 PM
So there is a performance gain then?

:edit: Perhaps I should just wait for the tutorial....

Make sure you cover the practical advantages dude.

Yes, there is a performance gain. It really depends on how it is set up. Just like Minty said, think of it not as 350 + 350 (it really is not a dual-proc system), but rather as two 350 MHz _systems_, each with its own memory and I/O subsystems. Tasks (programs) are taken on by each system based on metrics you set up. System (node) 1 gets 60% busy and the next task is sent to node 2 (and so on). Unless specifically programmed for clusters, programs will not split up across the different nodes, but different programs will run on different nodes. There are tools available to specify which node a particular program will run on (set up manually by the end user/administrator). I will not cover those tools in the first tutorial. The first one is going to be all about getting the cluster set up and running. The second one I am planning will get into some of the monitoring and tuning tools.
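
The point about whole programs landing on different nodes can be shown with a small Python sketch. It is a toy model with made-up job lengths: each job runs start to finish on one node and is never split, and `finish_time` is a hypothetical helper, not part of any cluster toolkit.

```python
def finish_time(job_seconds, nodes):
    """Time until all jobs are done when each whole job goes to the
    node that frees up first (jobs are never split across nodes)."""
    busy_until = [0.0] * nodes
    for seconds in job_seconds:
        i = busy_until.index(min(busy_until))  # node that is free soonest
        busy_until[i] += seconds
    return max(busy_until)

# One 10-second job: a second node does not help.
print(finish_time([10], 1), finish_time([10], 2))

# Two 10-second jobs: the second node halves the total wait.
print(finish_time([10, 10], 1), finish_time([10, 10], 2))
```

So the gain shows up when there are several programs to run, not as one program suddenly running twice as fast.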

Keep in mind that I am learning this too. My only advantage right now is that I have set up clusters in the past and know some of the gotchas. rocksclusters.org really does make it easy to make a cluster happen. They (UCSD) have put in a great deal of time creating the install programs that make it as easy as it is. Our web server here at work is running an HA cluster (www.mce.com). The systems are not set up to do load balancing or anything; it is just set up so that if the main server crashes, the secondary one takes over without any interruption in service. We do it this way because the second system is also our "test" server. We make changes to that system and, if there are no issues, the new setup is "pushed" (using rcp, but that is another lesson) to the main server at night (causing only a minor blip in service as the webserver is restarted).
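
The failover rule described above (main server does the work, secondary takes over only if main is down) boils down to something like this Python sketch. The server names and the `active_server` helper are hypothetical, and real HA software also has to detect failures and move network addresses, which this ignores.

```python
def active_server(health, order=("main", "secondary")):
    """Return the first healthy server in priority order, or None if all are down."""
    for name in order:
        if health.get(name, False):
            return name
    return None

print(active_server({"main": True, "secondary": True}))   # normal operation
print(active_server({"main": False, "secondary": True}))  # main has crashed
```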

I am hoping to start posting this tutorial either this weekend or Monday. I will more than likely just post the outline and items needed over the weekend, then do the actual work on my systems here at work.

Hehe I never thought being part of TBCS would actually cause me to learn geeky technical stuff (scsi vs ide thread), more than I have... Or to actually teach any of you guys anything. I was just expecting to have a nice purdy computer by now (My main rig is plain, noisy and an ugly beige color with no windows). Hehe go figure!

Drew
03-23-2007, 05:19 PM
Airbozo - the teacher.

Airbozo
03-26-2007, 12:59 PM
I am working on the tutorial and am closing this thread.

Please stand by...