3 Questions about the work flow of Petuum #2

Open
JunjieHu opened this issue Sep 21, 2014 · 0 comments

Suppose that there are m clients, where each client has n threads.

Q1:
Based on the code of the main function in the MF class, it seems that the MF class file is distributed to all m client machines. Each machine then runs the main function and starts n threads that run SolveMF(). Is my understanding correct?

class MatrixFactor
{
    void main() {
        // Run the worker threads.
        // Note: the loop runs over numWorkerThreads only, not getTotalNumWorker().
        for (int i = 0; i < numWorkerThreads; i++) {
            threads.add(new Thread(new SolveMF(i)));
            threads.get(i).start();
        }
    }
}
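
To make the question concrete, here is a minimal, self-contained sketch of the launch pattern I am assuming. It is not Petuum code: `ClientLaunchSketch`, `clientId`, `runClient` and the id formula `clientId * numWorkerThreads + localThreadId` are hypothetical names and assumptions of mine, used only to show how each of the m clients would start just its own n threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ClientLaunchSketch {
    // Hypothetical id scheme: global ids are assigned contiguously per client.
    static int globalWorkerId(int clientId, int numWorkerThreads, int localThreadId) {
        return clientId * numWorkerThreads + localThreadId;
    }

    // Runs on ONE client: starts only that client's local threads and
    // returns the global ids those threads were given.
    static List<Integer> runClient(int clientId, int numWorkerThreads) {
        List<Integer> ids = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < numWorkerThreads; i++) {
            final int gid = globalWorkerId(clientId, numWorkerThreads, i);
            Thread t = new Thread(() -> ids.add(gid)); // stand-in for SolveMF
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining workers", e);
            }
        }
        List<Integer> sorted = new ArrayList<>(ids);
        Collections.sort(sorted); // thread completion order is not deterministic
        return sorted;
    }

    public static void main(String[] args) {
        // Client 1 of m = 3 clients, with n = 4 threads per client.
        System.out.println(runClient(1, 4)); // prints [4, 5, 6, 7]
    }
}
```

Under this assumption, no single client ever creates all m * n threads, which is why the loop bound would be numWorkerThreads rather than the total worker count.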

Q2:
If I want to assign one block (e.g., Block(i,j)) of table A in the PS to each thread, I would write a mapping function from globalWorkerId to the block position (i,j). I implemented the following class:
class Block
{
    Block(int globalWorkerId) {
        (i,j) = hash(globalWorkerId);
        ... // other block information
    }
}
If my understanding in Q1 is correct, then I cannot create a Vector vecBlock in the main function and add every block to it, right? The main function on one client only knows about its own local threads, so it cannot add the blocks for all global threads.
For example,
class MetricLearn
{
    void main() {
        // Is this loop over all global workers not possible?
        for (int i = 0; i < getTotalNumWorker(); i++) {
            vecBlock.add(new Block(i));
        }
    }
}
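
As an alternative to the loop above, here is a minimal sketch of what I think each client would have to do instead: build blocks only for its own local threads. Everything here is an assumption for illustration, not Petuum API: `BlockMappingSketch`, `localBlocks`, the `numBlockCols` parameter, the row-major mapping that stands in for hash(), and the id scheme `clientId * numWorkerThreads + t`.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockMappingSketch {
    // Hypothetical block descriptor: position (i, j) in a grid of blocks of table A.
    static final class Block {
        final int i;
        final int j;

        Block(int globalWorkerId, int numBlockCols) {
            // Assumed mapping: worker ids laid out row-major over the block grid.
            this.i = globalWorkerId / numBlockCols;
            this.j = globalWorkerId % numBlockCols;
        }
    }

    // Builds blocks only for the threads local to one client.
    static List<Block> localBlocks(int clientId, int numWorkerThreads, int numBlockCols) {
        List<Block> blocks = new ArrayList<>();
        for (int t = 0; t < numWorkerThreads; t++) {
            int globalWorkerId = clientId * numWorkerThreads + t; // assumed id scheme
            blocks.add(new Block(globalWorkerId, numBlockCols));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Client 1 with 4 local threads, blocks arranged in 3 columns.
        for (Block b : localBlocks(1, 4, 3)) {
            System.out.println("(" + b.i + ", " + b.j + ")");
        }
    }
}
```

Since this mapping depends only on the id, each thread could equally construct its own Block inside run(). Whether main() on one client can or should enumerate all global ids is the part I am unsure about.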
Q3:
In the MatrixFactor class, every thread runs initMF() in SolveMF.run(), so every thread initialises tableL and tableR (clientTable). Are these tables stored in the thread cache? If so, doesn't storing both tables for every thread take too much space on a client machine? If the two tables are instead stored globally in the PS, why do they need to be initialised in each thread's SolveMF.run() function?

Junjie
