Home arrow static arrow Java Programming [Archive] - Multithreading
Warning: Creating default object from empty value in /www/htdocs/w008deb8/wiki/components/com_staticxt/staticxt.php on line 51
Java Programming [Archive] - Multithreading
This topic has 23 replies on 2 pages.    « Previous | 1 | 2 |

Posts:6,750
Registered: 1/25/04
Re: Multithreading  
Jun 16, 2004 9:46 AM (reply 15 of 23)



 
I don't know - I don't really understand your description of the process.
 

Posts:97
Registered: 1/27/04
Re: Multithreading  
Jun 16, 2004 11:01 AM (reply 16 of 23)



 
okay lemme try again:

I am writing the Hierarchical clustering algorithm.
[quote]Given a set of N items to be clustered, and an NxN distance (or similarity) matrix, the basic process of Johnson's (1967) hierarchical clustering is this:

1. Start by assigning each item to its own cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain.
2. Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.
3. Compute distances (similarities) between the new cluster and each of the old clusters.
4. Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.
[/quote]

So I am doing #2 using threads. Then I do the work for merging the clusters and then re-run the same thing.

Does this make sense more now?
 

Posts:6,750
Registered: 1/25/04
Re: Multithreading  
Jun 16, 2004 11:08 AM (reply 17 of 23)



 
Yep. I don't think I know enough about threading to help you, though.
 

Posts:37,103
Registered: 3/30/99
Re: Multithreading  
Jun 16, 2004 12:03 PM (reply 18 of 23)



 
1. Start by assigning each item to its own cluster, so
that if you have N items, you now have N clusters,
each containing just one item. Let the distances
(similarities) between the clusters equal the
distances (similarities) between the items they
contain.
2. Find the closest (most similar) pair of clusters
and merge them into a single cluster, so that now you
have one less cluster.
3. Compute distances (similarities) between the new
cluster and each of the old clusters.
4. Repeat steps 2 and 3 until all items are clustered
into a single cluster of size N.
[/quote]

So I am doing #2 using threads. Then I do the work for
merging the clusters and then re-run the same thing.

I'm not sure I follow, but I don't see why you can't use a queue (seems more appropriate than a stack, but I could be wrong) of "work units."

I can picture two different models here, and either one is viable with a queue (or stack, I guess, if that makes more sense).

1) Start with N work units: W-0..W-N. Each of these generates some new number (M-sub-N) of work units: W0-0..W0-MO through WN-0..WN-MN (I hope this notation is making sense.) Each new work unit is indepedent of anything that came before it or as part of the same step that generated it. Each work unit stands on its own, containing enough information to generate its M-sub-N work units. So once you complete one WU, you put its results in the queue, and the next available thread will pick it up, until you've processed everything. It's okay for work units to be done out of order.

2) Same initial situation, but this time, you have to wait for all the results of the first set of compuatations before you can create the next set of WUs. This is doable in a similar fasion, but is slighly more complex because instead of each worker thread just stuffing its results in the queue and then moving on, you have these synchronization points after each step before moving on. (This lends itself well to a UML activity diagram.) So you basically need a smarter dipatcher that puts the WUs int the queue, tells the worker threads to have at it, watches for them to be done (possibly by Thread.join, but there are other ways you could do it), then takes their accumulated results, maybe does some inter-iteration processing, puts these into the queue as the next set of WUs, and repeats the process.

Neither of the above is necessarily that complex to write, but they can be tricky to get right if you're not familiar with multithreading. And of course, you still have the issue that if you can't make you WUs big enough, you'll eat up more time managing the queue than you will actually computing results.
 

Posts:6,750
Registered: 1/25/04
Re: Multithreading  
Jun 16, 2004 12:06 PM (reply 19 of 23)



 
I'm not sure I follow, but I don't see why you can't
use a queue (seems more appropriate than a stack, but
I could be wrong) of "work units."

Right, I meant queue.
 

Posts:31,095
Registered: 4/30/99
Re: Multithreading  
Jun 16, 2004 12:28 PM (reply 20 of 23)



 
So I am doing #2 using threads. Then I do the work for
merging the clusters and then re-run the same thing.

Your original post was complaining that using threads actually took longer than not using threads. And from your brief description of the task, the threads are totally compute-intensive (i.e. they never have to wait for database access or anything like that), so that explains why the total clock time is equal to the total compute time plus thread-switching overhead. I believe it was jverd who went through that in more detail earlier. Given that, unless there's something else I don't know, the conclusion I arrive at is that using threads is pointless in this case.

By the way, you do realize that the distances matrix is symmetric so you only have to compute the part above (or below) the diagonal, right? That's the first improvement I made when I was given a program implementing that algorithm as my first major assignment.

PC²
 

Posts:37,103
Registered: 3/30/99
Re: Multithreading  
Jun 16, 2004 12:50 PM (reply 21 of 23)



 
Your original post was complaining that using threads
actually took longer than not using threads. And from
your brief description of the task, the threads are
totally compute-intensive

Good point. I'd lost track of that.

Now, if you have more than one CPU (which I think you said you do), then you might get a benefit from having as many threads as CPUs.

However...

* As I pointed out earlier, it requires that the amount of work that one thread can do at a time is large compared to the cost of accessing any shared resources (including memory just by entering & leaving a sync block) per unit of work.

* There's no guarantee that the OS will give each Java thread its own CPU. It may just switch them in and out sequentially on a single CPU.

* You said G5, right? Mac? Then there's the possibility of the exact opposite of the above: I thought I read somewhere a while back that OS X can multithread the VM itself, even if you don't explicitly use threads in Java, so when you use a single thread in your Java code, you may already be taking as much advantage of both CPUs as you're going to get. I don't know any details about this, but it's one thing to consider if you're not seeing what you think should happen.

 

Posts:97
Registered: 1/27/04
Re: Multithreading  
Jun 16, 2004 1:04 PM (reply 22 of 23)



 
#2 is what applies to my case.

Okay the way it works is that I start with K things to do, I run the threads and then do some inter-iteration processing then decrement K by one and then repeat things again. The amount of work threads do each iteration is very dependent on K.

I think I know how to fix this problem, but I am not sane enough at the time (3 hrs sleep) in the past 30 hrs. and my brain has been harrassed for last whole week. So I guess I will try that and update.

I did however force creating less threads when workload not high and the performance did improve. So now I can at least rest assure what the problem is, which I wasn't before. I thought synchronized synchronizing was the issue so i did away with that as you will note from my earlier posts. But that didn't help.
 

Posts:97
Registered: 1/27/04
Re: Multithreading  
Jun 16, 2004 1:16 PM (reply 23 of 23)



 
okay its p690 machine using dual POWER4 chips. The compiler being used is JRE1.4.1 AIX IBM with JIT enabled.

I have more than one algorithm that I need to write. There is no problem with the other algorithms because of their not so iterativeness.

By the way, you do realize that the distances matrix is symmetric so you only have to compute the part above (or below) the diagonal, right? That's the first improvement I made when I was given a program implementing that algorithm as my first major assignment.

DrClap, that's not the only optimization I have made. There are insanely many more optimizations I have made. One of the optimization is to use O(C*N) memory, C is a constant and only say 5-10% at worse performance hit compared to using O(N*N/2) memory with small N value. Using O(N*N/2) memory has its side-effects because of the overhead of updating it each iteration as N becomes larger. So if N were to equal say 6000, my O(C*N) runs roughly 10x faster. This is not to mention in single thread version. Given that I have been told that this will be used where N could be as large as 20K, you can imagine how much performance improvement mine will have.
 
This topic has 23 replies on 2 pages.    « Previous | 1 | 2 |